
Introduction to Automated Newsletter Generation with OpenAI Models

esmile1 2024. 10. 2. 05:43

 

This post summarizes an AI agent system.

 

1. AI Agent System:

 

1) Introduction to Automated Newsletter Generation with OpenAI Models

 

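Recent advances in AI make it possible to automate a wide range of tasks. This post introduces a system in which several AI agents, built on OpenAI models, cooperate to generate and distribute a newsletter automatically. The system automates the whole pipeline: web scraping, content writing, text-to-speech conversion, and final newsletter delivery.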

 

2) System Structure

 

The AI agent system consists of the following components:

 

Web Agent

Rewrite Agent

Voice Agent

Newsletter Agent

Mastermind

 

Each agent handles one specific task, and the mastermind coordinates the overall process.

 

3) Agent Details

 

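(1) Web Agent

The web agent's main job is to collect information from the web. In this project it scrapes articles from the "Most Popular" section of The Verge website.

Targets a specific section using XPath

Collects unstructured data from multiple articles

Reports the collected information to the mastermind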

 

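(2) Rewrite Agent

The rewrite agent converts the unstructured data collected by the web agent into a structured format.

Uses an OpenAI chat model (GPT-4 in the simplified code in this post; the source video runs GPT-4o mini)

Writes an attention-grabbing introduction

Keeps the formatting consistent

Includes image URLs and article URLs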

 

(3) Voice Agent

The voice agent converts the written stories into speech.

Uses the ElevenLabs API

Generates an MP3 file

Uploads the audio file to cloud storage

Provides the audio URL to embed in the newsletter

 

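(4) Newsletter Agent

The newsletter agent produces the final form of the newsletter.

Generates an HTML template for the email newsletter

Combines the stories, images, and audio link into a visually appealing layout

Follows a predefined structure and style guidelines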
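To make the idea of a predefined structure concrete, here is a minimal sketch of filling a fixed HTML layout with stories, images, and an audio link. It is an assumption-laden illustration, not the method from the source: the post has a language model fill the template, whereas this sketch uses the standard library's `string.Template`. The names `PAGE`, `STORY`, and `render_newsletter`, the story dictionary keys, and the `example.com` URLs are all hypothetical.

```python
from html import escape
from string import Template

# Hypothetical page layout; the real template from the source is not shown in the post.
PAGE = Template("""<html><body>
<h1>$title</h1>
<p>$greeting</p>
$stories
<p><a href="$audio_url">Listen to this issue</a></p>
</body></html>""")

# Hypothetical per-story block with an image, a linked headline, and a summary.
STORY = Template(
    '<div class="story"><img src="$image_url" alt="">'
    '<h2><a href="$url">$headline</a></h2><p>$summary</p></div>'
)


def render_newsletter(title, greeting, stories, audio_url):
    """Render the newsletter, HTML-escaping every value before insertion."""
    blocks = [
        STORY.substitute({key: escape(str(value)) for key, value in story.items()})
        for story in stories
    ]
    return PAGE.substitute(
        title=escape(title),
        greeting=escape(greeting),
        stories="\n".join(blocks),
        audio_url=escape(audio_url),
    )


if __name__ == "__main__":
    print(render_newsletter(
        "Tech Insights You Can't Miss",
        "Hello Tech Enthusiast, dive into today's newsletter",
        [{
            "image_url": "https://example.com/image.png",
            "url": "https://example.com/article",
            "headline": "Example headline",
            "summary": "Example summary.",
        }],
        "https://example.com/newsletter_audio.mp3",
    ))
```

A deterministic renderer like this trades flexibility for predictability: the layout can never drift, but the wording of each story has to arrive already written.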

 

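(5) Mastermind

The mastermind uses an OpenAI model to manage and coordinate the whole process (o1-preview in the source video; GPT-4 in the simplified code in this post).

Evaluates the quality of each agent's output

Suggests improvements and gives feedback

Decides whether to proceed to the next step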

 

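4) How the System Works

The web agent scrapes articles from The Verge

The mastermind evaluates the collected information

The rewrite agent restructures the articles

The voice agent generates an audio version

The newsletter agent generates the HTML template

The mastermind reviews the final result

The newsletter is sent to subscribers through an API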
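The flow above amounts to a sequence of stages with an approval gate after some of them. The following is a minimal sketch under stated assumptions: `Stage`, `run_pipeline`, and the `review` callback are hypothetical names, the stage functions are stand-ins rather than real agents, and the retry limit of two is an illustrative choice (the source video only says the mastermind sometimes goes back and forth with an agent).

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Stage:
    name: str
    run: Callable[[], str]      # produces this stage's output
    needs_review: bool = True   # whether the mastermind evaluates the output


def run_pipeline(
    stages: List[Stage],
    review: Callable[[str, str], bool],
    max_attempts: int = 2,
) -> Tuple[bool, List[str]]:
    """Run stages in order and halt at the first stage the reviewer keeps rejecting."""
    log: List[str] = []
    for stage in stages:
        approved = False
        for attempt in range(1, max_attempts + 1):
            output = stage.run()
            approved = (not stage.needs_review) or review(stage.name, output)
            status = "approved" if approved else "rejected"
            log.append(f"{stage.name}: attempt {attempt} {status}")
            if approved:
                break
        if not approved:
            log.append(f"halted at {stage.name}")
            return False, log
    return True, log


if __name__ == "__main__":
    # Stand-in stages; a real system would call the agents here.
    demo_stages = [
        Stage("Web Agent", lambda: "raw articles"),
        Stage("Rewrite Agent", lambda: "short stories"),
        Stage("Voice Agent", lambda: "audio url", needs_review=False),
        Stage("Newsletter Agent", lambda: "<html>...</html>"),
    ]
    ok, lines = run_pipeline(demo_stages, review=lambda name, out: bool(out))
    print(ok)
    print("\n".join(lines))
```

Keeping the reviewer as a plain callback means the same loop works whether the decision comes from a language model or from a simple rule.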

 

5) Code Implementation

 

Let's now look at how each agent is implemented.

 

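Web agent code

# Web agent code (simplified)
import requests
from lxml import html


def scrape_verge():
    url = "https://www.theverge.com"
    response = requests.get(url, timeout=30)
    response.raise_for_status()  # fail early on HTTP errors
    tree = html.fromstring(response.content)

    # Target the "Most Popular" section with XPath. The class name is the one
    # given in this post and may need updating if the site's markup changes.
    articles = tree.xpath('//div[@class="c-compact-river__entry"]')

    content = ""
    for article in articles:
        titles = article.xpath('.//h2/text()')
        links = article.xpath('.//a/@href')
        if not titles or not links:
            continue  # skip entries that lack a title or a link
        content += f"{titles[0].strip()}\n{links[0]}\n\n"

    with open("content.txt", "w", encoding="utf-8") as f:
        f.write(content)


if __name__ == "__main__":
    scrape_verge()

This code scrapes the articles in the "Most Popular" section of The Verge website and saves them to content.txt.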
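The XPath step can be tried without any network access. The sketch below is an illustration under assumptions: it swaps lxml for the standard library's `xml.etree.ElementTree` (which supports only a subset of XPath and needs well-formed markup), and the sample markup, the `entry` class name, the `extract_entries` function, and the `example.com` base URL are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed sample markup standing in for a scraped page.
SAMPLE = """
<html><body>
  <div class="popular">
    <div class="entry"><h2>First headline</h2><a href="/a/1">read</a></div>
    <div class="entry"><h2>Second headline</h2><a href="/a/2">read</a></div>
    <div class="entry"><a href="/a/3">no title</a></div>
  </div>
</body></html>
"""


def extract_entries(markup, base_url="https://example.com"):
    """Return (title, absolute_url) pairs for every entry that has both."""
    root = ET.fromstring(markup.strip())
    entries = []
    # Attribute predicates such as [@class='entry'] are part of the XPath
    # subset that ElementTree understands.
    for node in root.findall(".//div[@class='entry']"):
        title = node.find(".//h2")
        link = node.find(".//a")
        if title is None or link is None or not title.text:
            continue  # skip incomplete entries instead of raising
        entries.append((title.text.strip(), base_url + link.get("href", "")))
    return entries


if __name__ == "__main__":
    for title, url in extract_entries(SAMPLE):
        print(title, url)
```

Skipping incomplete entries matters in practice because popular-article widgets often mix in promotional blocks that lack a headline or a link.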

 

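Rewrite agent code

# Rewrite agent code (simplified)
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment


def rewrite_content():
    with open("content.txt", "r", encoding="utf-8") as f:
        content = f.read()

    prompt = f"""
    Create a cohesive set of short, captivating stories from the news articles.
    Include image URLs and article URLs.
    Begin with a very brief, attention-grabbing introduction.

    Content:
    {content}
    """

    # Client-style call for openai>=1.0; the legacy openai.ChatCompletion.create
    # interface was removed in that release.
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": prompt}],
    )
    stories = response.choices[0].message.content

    with open("stories.txt", "w", encoding="utf-8") as f:
        f.write(stories)


if __name__ == "__main__":
    rewrite_content()

This code uses the GPT-4 model to restructure the scraped content and saves the result to stories.txt.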

 

Voice agent code

# Voice agent code (simplified)
from azure.storage.blob import BlobServiceClient
# generate and save are the module-level helpers used in this post; they match
# older releases of the elevenlabs package, and newer releases may require a
# client-based call instead.
from elevenlabs import generate, save


def create_audio():
    with open("stories.txt", "r", encoding="utf-8") as f:
        stories = f.read()

    audio = generate(
        text=stories,
        voice="Josh",
        model="eleven_monolingual_v1",
    )
    save(audio, "newsletter_audio.mp3")

    # Upload to Azure Blob Storage (replace the placeholders with real values)
    connection_string = "your_azure_connection_string"
    container_name = "your_container_name"
    blob_name = "newsletter_audio.mp3"

    blob_service_client = BlobServiceClient.from_connection_string(connection_string)
    container_client = blob_service_client.get_container_client(container_name)
    blob_client = container_client.get_blob_client(blob_name)

    with open("newsletter_audio.mp3", "rb") as data:
        blob_client.upload_blob(data, overwrite=True)

    audio_url = blob_client.url
    print(f"Audio URL: {audio_url}")


if __name__ == "__main__":
    create_audio()

This code converts the text to speech with the ElevenLabs API and uploads the result to Azure Blob Storage.
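The source video mentions avoiding asterisks and similar symbols because they sound odd when read aloud. One way to enforce that before the text reaches the speech API is a small cleanup pass. This is a sketch under assumptions: `clean_for_speech` is a hypothetical helper that does not appear in the source, and the set of symbols it strips is an illustrative guess.

```python
import re


def clean_for_speech(text):
    """Strip markup that a text-to-speech voice would read out literally."""
    # Replace markdown links [label](url) with just the label.
    text = re.sub(r"\[([^\]]+)\]\([^)]+\)", r"\1", text)
    # Drop bare URLs, which are tedious to listen to.
    text = re.sub(r"https?://\S+", "", text)
    # Remove emphasis, heading, and code markers (this also removes
    # underscores inside words, which is acceptable for spoken text).
    text = re.sub(r"[*_#`]+", "", text)
    # Collapse the runs of spaces left behind.
    text = re.sub(r"[ \t]+", " ", text)
    # Trim each line and drop the ones that became empty.
    return "\n".join(line.strip() for line in text.splitlines() if line.strip())


if __name__ == "__main__":
    sample = (
        "## **Big news**\n"
        "Read [the story](https://example.com/a) now.\n"
        "https://example.com/i.png\n"
    )
    print(clean_for_speech(sample))
```

Running the cleanup on the stories just before the speech call keeps the written version, with its links and image URLs, untouched for the newsletter.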

 

Newsletter agent code

# Newsletter agent code (simplified)
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment


def create_newsletter():
    with open("stories.txt", "r", encoding="utf-8") as f:
        stories = f.read()
    with open("newsletter_template.html", "r", encoding="utf-8") as f:
        template = f.read()

    prompt = f"""
    Use the following template to create an HTML newsletter:
    {template}

    Content to include:
    {stories}

    Generate the newsletter content in HTML format, following the template structure.
    """

    # Client-style call for openai>=1.0; the legacy openai.ChatCompletion.create
    # interface was removed in that release.
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": prompt}],
    )
    newsletter_html = response.choices[0].message.content

    with open("newsletter.html", "w", encoding="utf-8") as f:
        f.write(newsletter_html)


if __name__ == "__main__":
    create_newsletter()

This code uses the GPT-4 model to generate the newsletter in HTML format and saves it to newsletter.html.
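The source video mentions using a regular expression to get the model's reply into the expected format, without showing it. The following is a sketch of what such a step might look like, under assumptions: `extract_html` is a hypothetical helper, and the patterns (an optional fenced block, then an HTML document) are a guess at typical model output rather than the author's actual expression.

```python
import re

FENCE = "`" * 3  # a markdown code fence, built this way to keep the listing readable


def extract_html(response_text):
    """Pull the HTML document out of a model reply that may wrap it in prose or fences."""
    fenced = re.search(
        re.escape(FENCE) + r"(?:html)?\s*(.*?)" + re.escape(FENCE),
        response_text,
        re.DOTALL | re.IGNORECASE,
    )
    candidate = fenced.group(1) if fenced else response_text
    document = re.search(
        r"(<!DOCTYPE html.*?</html>|<html.*?</html>)",
        candidate,
        re.DOTALL | re.IGNORECASE,
    )
    if document is None:
        raise ValueError("no HTML document found in the model response")
    return document.group(1).strip()


if __name__ == "__main__":
    reply = (
        "Here is your newsletter:\n"
        + FENCE + "html\n<html><body><h1>Hi</h1></body></html>\n" + FENCE
        + "\nLet me know if you want changes."
    )
    print(extract_html(reply))
```

Raising an error when no document is found gives the orchestrating script a clear signal to retry instead of saving a broken newsletter.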

 

Mastermind implementation

The mastermind plays the key role of coordinating the whole process. The following code implements its main functions:

import datetime
import shlex
import subprocess

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment


def analyze_content(content, agent_role):
    # Include the current date and time so recent news is not judged as fake
    current_time = datetime.datetime.now().strftime("%Y-%m-%d %H:%M:%S")
    prompt = f"""
    Current date and time: {current_time}

    You are the Mastermind overseeing the newsletter creation process.
    Analyze the content produced by the {agent_role}.
    Address any issues related to quality, engagement, and relevance.
    Identify issues, suggest improvements, and if suitable, confirm that it meets standards.

    Content:
    {content}
    """
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content


def run_agent(command, agent_name):
    # Agent execution logic. The post leaves this as a stub; running the
    # command as a subprocess is one assumed implementation.
    print(f"Running {agent_name}...")
    subprocess.run(shlex.split(command), check=True)


def main():
    # Run the web agent and analyze its result
    run_agent("python web_agent.py", "Web Agent")
    with open("content.txt", "r", encoding="utf-8") as f:
        content = f.read()
    web_analysis = analyze_content(content, "Web Agent")
    print(web_analysis)
    if "suggest improvements" in web_analysis.lower() or "issues" in web_analysis.lower():
        print("Web content needs improvements. Process halted.")
        return

    # Run the rewrite agent and analyze its result
    run_agent("python rewrite_agent.py", "Rewrite Agent")
    with open("stories.txt", "r", encoding="utf-8") as f:
        stories = f.read()
    rewrite_analysis = analyze_content(stories, "Rewrite Agent")
    print(rewrite_analysis)
    if "suggest improvements" in rewrite_analysis.lower() or "weaknesses" in rewrite_analysis.lower():
        print("Stories need improvements. Process halted.")
        return

    # Run the voice agent, then the newsletter agent
    run_agent("python voice_agent.py", "Voice Agent")
    run_agent("python newsletter_agent.py", "Newsletter Agent")
    with open("newsletter.html", "r", encoding="utf-8") as f:
        newsletter = f.read()

    # Analyze the newsletter agent's result
    newsletter_analysis = analyze_content(newsletter, "Newsletter Agent")
    print(newsletter_analysis)
    if "suggest improvements" in newsletter_analysis.lower() or "issues" in newsletter_analysis.lower():
        print("Newsletter needs improvements. Process halted.")
        return

    # Send the newsletter
    run_agent("node send_newsletter.js", "Send Newsletter")
    print("Newsletter sent successfully!")


if __name__ == "__main__":
    main()

This code analyzes the result of each step and either halts the process or moves on to the next step.
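The decision in the code above depends on whether certain phrases appear anywhere in the analysis. A reply such as "I found no issues" contains the word "issues" and would therefore halt the pipeline. One possible alternative, sketched here under assumptions, is to ask the reviewer for an explicit verdict line and parse only that. The `VERDICT:` convention, `VERDICT_INSTRUCTION`, `keyword_gate`, and `verdict_gate` are hypothetical and do not come from the source.

```python
import re

# Hypothetical sentence to append to the reviewer prompt.
VERDICT_INSTRUCTION = (
    "End your reply with exactly one line: 'VERDICT: APPROVED' or 'VERDICT: REVISE'."
)


def keyword_gate(analysis, blockers=("suggest improvements", "issues")):
    """Approve only if none of the blocker phrases appear (the substring approach)."""
    lowered = analysis.lower()
    return not any(phrase in lowered for phrase in blockers)


def verdict_gate(analysis):
    """Approve only if the last explicit verdict line says APPROVED."""
    verdicts = re.findall(
        r"^\s*VERDICT:\s*(APPROVED|REVISE)\s*$",
        analysis,
        re.MULTILINE | re.IGNORECASE,
    )
    return bool(verdicts) and verdicts[-1].upper() == "APPROVED"


if __name__ == "__main__":
    reply = "The stories read well and I found no issues.\nVERDICT: APPROVED"
    print(keyword_gate(reply))  # the word "issues" blocks approval
    print(verdict_gate(reply))  # the explicit verdict line decides
```

Treating a missing verdict as a rejection keeps the gate conservative: nothing is sent unless the reviewer clearly approved it.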

 

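6) Running the System and Results

Running the system goes through the following steps:

The web agent scrapes articles from The Verge.

The mastermind analyzes and evaluates the scraped content.

The rewrite agent restructures the articles.

The mastermind evaluates the restructured stories.

The voice agent generates an audio version.

The newsletter agent generates the newsletter in HTML format.

The mastermind evaluates the final newsletter.

Once every step is approved, the newsletter is sent to subscribers.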

 

Example output

 

The final newsletter looks like this:

Title: "Tech Insights You Can't Miss"

Greeting: "Hello Tech Enthusiast, dive into today's newsletter"

 

< Sources >

 

My First Successful AI Agent Ish Project feat OpenAI-o1! - YouTube

https://www.youtube.com/watch?v=1eYFbLgpurw

 

Transcript:

(00:00) What you see here is kind of the first system that I have to consistently produce the results I want. Uh, we are featuring OpenAI o1 models in this, uh, but I'm going to walk you through how I actually set up all these agents and how they kind of work and how they collaborate together. So I think we're just going to head over there and I'm going to start explaining to you how I set this up and how you, yeah, can use it too. Before we dive into the code and how this actually works and what results it produces, I just want to go through kind

(00:29) of the work here, to kind of explain how I wanted to set this up. So you can see here I have four different agents, and we have this Mastermind that is going to be run by OpenAI o1-preview, that is kind of the latest, biggest model, right? So the first agent is going to be the web agent. So this is responsible for getting on the web, collecting the information we want, right? And it's going to report back to the Mastermind when it has, uh, fetched the information it needs, and the Mastermind is going to evaluate: is this good enough to proceed with? If yes, we

(01:08) move on; if not, it's going to send a message back: uh, this was not good enough, do something else, right? That is basically the idea. But let's say this was good enough; now then we can kind of move on to our second agent, and that is going to be kind of the writer that is going to rewrite, uh, the kind of unstructured data we collected from the website into, uh, yeah, a more structured format. And here it usually goes a bit back and forth, right, because usually the o1 model here wants to do some changes, it wants to, yeah, clean up, and, yeah, you will

(01:45) see what kind of suggestions it has to improve this. And when this is approved, uh, we're actually going to move over to the voice agent. So this has the responsibility of taking this, yeah, the short stories here, we're going to write the new stories and turn them into a script that fits well to turn into a voice version of this that we are going to generate. This has almost no, uh, this sends the signal, let's go do this, but I haven't really seen it try to correct this. Maybe one time I saw it made some suggestions to improve it, but usually

(02:26) this works pretty well by using GPT-4o mini. I haven't seen any big issues here. And the final agent we have is the newsletter agent, so this is the final form. So this is the newsletter agent. Uh, this agent is responsible for actually taking, uh, the stories we have and turning them into an HTML template that we can send as a newsletter, uh, as you will see pretty soon. And this also, uh, when our big Mastermind o1 model sends these instructions to this agent, we usually go a bit back and forth because it usually wants to change a few

(03:06) things. Uh, but when the o1 Mastermind is happy with the final form here, uh, we have one more thing, and this is kind of set in the middle here, right? So then we are going to actually send the newslet, oops, you're going to send the newsletter out to our mailing list, right, that we kind of, that is our subscriptions. And for this we have some kind of API that we're going to use. You can say that this is agent five, but it's basically a simple script that gets called by the Mastermind, and we get like a confirmation that this is sent to our

(03:45) mailing list. So yeah, I've been testing it out a lot and it seems to work pretty good, if you ask me. Uh, the thing that's worked pretty good is how good these models are at following instructions now, and this is kind of what I changed when I tried this in the past, because I have struggled making every single step follow the instructions I want. But something must have happened the last six months since the last I tried this, because now it works superb and I'm producing very good results, as you will see soon. So yeah, I'm

(04:17) very happy with this and I think it's pretty cool to see these kind of results. Uh, but I think we're just going to take a look at the code now, uh, to kind of show you how I set this up, and then we're going to do a few tests, see what we can bring back, and yeah, let's just head over to Cursor and take a look at our code here. Okay, so let's start by just looking at the individual agents we have here, right? So I think we're just going to start with the web agent. This has a very specific job. Uh, yeah, this is a bit shady,

(04:49) but what I wanted to do is to create a tech newsletter, and for this we have to do some scraping. So I kind of picked one of the biggest sites I know, so we don't, yeah, so it's going to be The Verge, and for this I have set up some XPaths here that are kind of very, uh, targeted. Uh, I could show you what these XPaths point to, just give me a sec. So if you go to The Verge homepage, they have this thing called Most Popular, and this is what it points to. But, uh, yeah, you could pretty much change this to anything you

(05:25) wanted to, right? It could be your homepage, it could be Google, that is basically up to you. But in this case, uh, for my test, I pointed to this Most Popular section here on The Verge, just for my test cases. Uh, so that is basically how the first, uh, code works, so we can just run it without going into too much detail. Uh, I will put all of this full project up on my, uh, community GitHub, so if you want to try this out, set this up for yourself, just become a member of the channel and you will get access to this and you can play around

(05:58) with it. So let's just run the web agent now, so we can do this one by one. So yeah, you can see this is going out and collecting some information for us, right? So we got this, uh, text file here called content, and this is kind of the unstructured data we got. So it's 800-something lines here with, uh, it is the stories from the links, but it's pretty much unstructured, so it's not so usable, right? And that is kind of where the, um, the second agent comes into play, and that is the rewrite agent. Here you can

(06:36) see we are running GPT-4o mini, and this has the task of: create a cohesive set of short, captivating stories from the news articles, right? Include image URLs, article URLs, because we want to be able to link back to The Verge, and begin with a very brief, attention-grabbing introduction. So yeah, I'm just doing some prompting here, right? We have an f-string because we want to feed, uh, the content we just created, so that is content.txt,

(07:08) into this prompt, and from that we can just generate the stories.txt, right? And if we go down here we can take a look at this, uh, agent too. So python, uh, what did I call it, newsletter agent? No, it's not that, it's rewrite agent, right? So if we run this now, this is going to look at the content.txt and, yeah, create stories based on our prompt. And again, this is very flexible, you can turn this into whatever you want your newsletter to be about, right? And I try to avoid asterisks and stuff because this sounds weird in our voice, right?

(07:48) Stories have been written to stories, okay. So here are our stories, so you can see we have the links, the images, and we have kind of just a short story or notes here about each case. Uh, I think it turned out pretty good, and, but I'm not going to, you can see Marques Brownlee, influential YouTuber known as MKBHD, unve this wallpaper, yeah, I'm not going to go too much into this. Uh, but that is basically our second agent, and then we have the third agent, that is going to be the voice agent, right? So if you go into this you can see we are using ElevenLabs

(08:27) here, and we have some Azure storage because we want to send the MP3 file so we can host it, so people can load it from there. So we have our prompt here, so: please rewrite the following stories for a short conversation. We're just going to feed in the stories. Uh, this is also a pretty basic setup: ElevenLabs, we have our voices, you can pick whatever voice you want, right? And we can try this too, so if we run this now, uh, we will get an Azure URL link here where we are actually hosting our MP3 file that contains the news voice

(09:08) version of our stories. So here we got the link. Uh, I can just, uh, take this down and we can have a quick listen, right? Okay, so you can see, uh, we got the MP3 file here, so we can just play, let's play 10 seconds of this now. So this is what the link in the newsletter is going to point to: "In today's Tech Roundup, we cover the latest happenings in technology, gaming, and legal news. First up, YouTube star Marques Brownlee, known as MKBHD..." Yeah, you get the point. It's just a voice version if people want to

(09:41) listen to that instead of reading, right? Uh, I think that's pretty popular now amongst articles online, right? Uh, yeah, that was the voice agent, and the third, fourth agent is the newsletter agent, and this is going to rewrite, um, our newsletter into a following structure that is going to have a template. Uh, so this is the HTML template we are going to use, and you can see it's going to use this template to actually rewrite, uh, use the newsletter template as the template, and we're going to feed the stories, generate the newsletter content in HTML

(10:25) format and kind of follow the template we have. And this has worked pretty good for me. We're using some regex here to kind of get it in the format we want, uh, but it has been working pretty good. And this is going to be, uh, we're going to write this HTML template into our send newsletter agent.js

(10:47) here, so you can see here is kind of the HTML code that we're going to send out as a newsletter, right? So yeah, that is basically it, uh, for our agents. Uh, I think you'll understand better when we run it. Uh, but if we take a look at our main now, this is kind of what controls the whole, the full system here. So we have a function called run_agent, so this is going to have a command and an agent name. Uh, we're going to have a function called analyze_content because we want to, yeah, basically analyze the content. So what made this a bit of a challenge is to

(11:25) actually use the o1-mini and o1-preview, since we don't have a system prompt, so we kind of have to feed, uh, our prompts into the prompt. Uh, it doesn't matter too much, but it would make it a bit easier if we had, uh, system prompts. But you can see here, this is the prompt that is going to be fed into our first, uh, what do you call it, like analysis. So this is going to analyze the context coming from the web agent, I think. So what I found out is when using the o1 models we have to kind of feed in the date and the time, because

(12:04) sometimes I think the news is in the future and it regards this as fake news because it's in the future. So what I have to do is feed in the time and the date, and that seems to work pretty good, right? Okay, so the kind of way I set up this decision, uh, based on analysis is kind of what we have in our prompt here. So we're going to take in the content, the agent role, right, we're going to address it for issues related to quality, engagement, identify issues, suggest improvements, and if it's, uh, suitable, confirm that it

(12:38) meets its standards. And you can see we have this: if "suggest improvements" is in a string in their response, or "issues", then we're going to skip this; if not, and we're going to try to reanalyze the content, reanalyze, reanal, reanalyze, wow, that's difficult, the content, right? And if there's no issues we're just going to skip to the next agent, right? Uh, so that is basically how I set up this, the whole way. We're going to rewrite the articles; again we have this "suggest enhancement" or "weaknesses" in the story. Uh, we're going to try to fix it; if

(13:17) not, we're just going to skip to the next agent, and that is basically the logic for all of my, yeah, for all of the o1 instructions here. So it's not that complex, but, uh, for me it's been working pretty good and the results have been, yeah, uh, up to a good standard, I think. And like I said, if you want to dive deep into this you can find the code on the community GitHub. Uh, but I think we're just going to do a few tests now and then you kind of see in, um, real time how it works instead of me

(13:51) trying to explain it. So let's just set this up and let's do a few runs, and then we can go to my email and kind of see how the newsletter looks in its final form. So yeah, let's just open up the terminal and let's run this now, uh, like a full cycle, and we're going to try to see what, uh, our Mastermind suggests here. So you can see we are importing the time, right, and it's running the web agent now, so this is going to do the scraping from The Verge. So here we can see now, so it kind of, this is a response from our

(14:26) prompt. It kind of went through every single source, right? So the first source was Marques Brownlee and kind of the second source was the mobile phone thing, and at the bottom here there are some overall recommendations: remove redundant system-generated text, and yeah, we basically have some suggestions to improve this, right? And, uh, it's going to adjust content based on analysis. So here we did a reanalysis, right, you can see content reanalysis, and now overall, uh, it looks better, I think. So we have relevance, quality,

(15:03) engagement, and at the bottom now you can see the conclusion is: all five articles are suitable for inclusion in the tech newsletter, offering valuable insights into, yeah, you can see. So that worked, and now we kind of skipped onto our next agent, that is going to be the rewrite agent, and looks like it was kind of happy with that on the first attempt here. That was kind of surprising, right? So if we go here, so the stories analysis, and it goes through every single story here and it comes up with like a conclusion: uh, the newsletter excels in

(15:40) delivering clear, engaging, well-flowing content across various topics. Okay, that was pretty good, usually it does this two times. The newsletter is well written and largely ready for publication. Okay, that was pretty good. And the voice segment was generated successfully. Now we are running the newsletter agent. Okay, so we are still waiting for the newsletter. Uh, yeah, newsletter was sent, that means that this was approved too. So you can see newsletter analysis: overall quality, design, and it kind of goes through and critiques every single step

(16:14) here, and it says ready for publication, and it was sent. And this kind of included, uh, I have one email on my mailing list, so I'm going to go to my Gmail now and let's see if we got, uh, the newsletter. Okay, perfect, so we can see we got the email. If we go down to the latest test now: Tech Insights You Can't Miss. So we embedded the image, that looks great. Hello Tech Enthusiast, dive into today's newsletter. So this is a small intro, I like that. Today's highlights: Marques Brownlee wallpaper app,

(16:51) CA the Privacy store am the Boost digital artist, okay, experts warn against digital IDs, law enforcement, Sony State of Play, good. Uh, here's the link to the latest news so we can listen now. Uh, you can see we got the MP3 file here, great. We have some additional, uh, insights, so here we get the links, right, so we can open up the link and here we kind of pay back to The Verge, right, since we send them to these links, right? So that worked pretty good. We got the unsubscribe feature. You can see there is some HTML here that we should have probably

(17:28) cleaned up, uh, but I think it looks pretty cool and I'm super happy with how easy it is to generate this, uh, just running one script and everything is fully automated by using these, uh, I would call them simple agents. They are not perfect but they do their job, and the big Mastermind kind of orchestrates everything, right, calls up the right script at the right time and stuff, right? So yeah, just to wrap this up, uh, it's been a while since I actually tried to create something using agents and I was pleasantly surprised. I

(18:02) was super happy how they kind of follow along with my, uh, kind of detailed instructions, and, uh, I was super impressed by the template because that needs to be very precise, uh, if you're going to get that HTML to work, right? So, uh, my conclusion is this is going, uh, somewhere, and I think now we can kind of start to create this, uh, I think for now we can do simple tasks such as this. It's not too complex, you can even say it's not really agentic, uh, but we have some kind of start here. I think we can start

(18:36) to work more on it when we kind of get into better models, quicker models, cheaper models. This is not so cheap to run, uh, but we can easily swap out, uh, the o1 model with GPT-4o and we save a lot of money, and I tested it and the results are pretty okay. Uh, it's a bit, uh, it fails sometimes, but with the o1 it almost never fails. So my conclusion is, I think we are starting to see now that some, uh, tasks can start being automated by using these agents, and I'm going to keep exploring more. And again, if you want access, just become a member of the

(19:16) channel. Some other things I want to show you too: I have some services on my homepage, so you can do like a shout-out in one of my YouTube videos if you have a project you want to share with people. Uh, if you just want to talk to me about something, just go to my web page, you'll find the link in the description, and here I have kind of my newsletter too, and yeah, some of my videos, so go check that out. Uh, other than that, yeah, thank you for tuning in, have a great day, hope you learned something, and I'll see you again shortly.