We began exploring the world of Twitter only recently. The goal was simple: stay informed about the latest news and connect with others who share our interests.
Everything seemed smooth at first. We followed a few accounts and started diving into our feed. But it didn't take long before the stream of tweets became too much to handle. They came from every direction, and while some were genuinely insightful, others were mostly noise. People tweet about topics that interest us, then about their plans for the evening, their pets, and so on.
Ads also became a significant distraction, appearing every now and then. We understand that Twitter needs to earn revenue, but the ads often felt irrelevant, and the platform's suggestions also fell a bit flat for our taste.
So we began considering how to filter our feed to make it more manageable. Wouldn't it be great if we could keep only the interesting tweets and drop the rest? But how do we define which tweets are interesting?
Despite Twitter's limited flexibility in managing feeds, its API capabilities offer a potential solution. Coupled with GPT-3, we figured we might be able to construct our ideal Twitter feed.
Here's our plan:
First, we'll sift through tweets based on specific keywords.
Next, we'll manually label the tweets from the first step to differentiate between what we want to see and what we consider noise.
Lastly, we'll use this labeled dataset to fine-tune GPT-3 and employ the resulting model to classify future tweets.
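The first step of the plan, keyword filtering, is simple enough to sketch in a few lines. This is an illustrative example, not our production code; the keyword list and sample tweets are made up:

```python
# A minimal sketch of step one: keeping only tweets that mention one of
# our keywords. The keyword list and sample tweets are illustrative.
KEYWORDS = ["chatgpt", "openai", "gpt-3"]

def matches_keywords(tweet_text: str) -> bool:
    """Return True if the tweet mentions any keyword (case-insensitive)."""
    text = tweet_text.lower()
    return any(kw in text for kw in KEYWORDS)

tweets = [
    "ChatGPT just helped me refactor an entire module!",
    "Look what my cat did this morning...",
]
filtered = [t for t in tweets if matches_keywords(t)]  # keeps only the first tweet
```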
We needed a few tools to get started. The Twitter API was an obvious first choice: it's well documented and relatively straightforward to use.
Secondly, we decided to go with OpenAI. Several of its models (Ada, Babbage, and Curie) could handle classification; given its simplicity and cost-effectiveness, we chose to start with Ada.
Thirdly, we needed a user interface application for the initial tweet labeling and regular feed management. We opted for Bubble, a web application platform. Our choice was influenced by our experience with Bubble and its compatibility with our needs.
Lastly, we needed a system to manage API calls, and Microsoft Power Automate seemed to fit nicely. Crucially, it does not charge per operation, a critical factor given the number of tweets we planned to process.
With all the tools in place, we began our journey.
We swiftly created a Bubble app and linked it to the Twitter API, which generously provided us with hundreds of tweets on topics like "ChatGPT," "OpenAI," and "GPT-3."
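For readers curious what such a request looks like, here is a hedged sketch of a call to the Twitter API v2 recent-search endpoint. We only build the URL and headers here; `<BEARER_TOKEN>` is a placeholder, and the exact query operators we used in the app may have differed:

```python
from urllib.parse import urlencode

# A sketch of a Twitter API v2 recent-search request. <BEARER_TOKEN> is a
# placeholder credential; -is:retweet and lang: are v2 search operators.
SEARCH_URL = "https://api.twitter.com/2/tweets/search/recent"

def build_search_request(keywords, max_results=100):
    """Build the URL and auth header for a recent-search call."""
    query = "(" + " OR ".join(f'"{kw}"' for kw in keywords) + ") -is:retweet lang:en"
    params = {"query": query, "max_results": max_results}
    url = SEARCH_URL + "?" + urlencode(params)
    headers = {"Authorization": "Bearer <BEARER_TOKEN>"}  # placeholder token
    return url, headers

url, headers = build_search_request(["ChatGPT", "OpenAI", "GPT-3"])
```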
We aimed for a balanced dataset: 100 tweets we liked and 100 we'd prefer to skip. However, this wasn't straightforward, since the tweets we wanted to skip far outnumbered the ones we liked.
Upon completing the labeling, we noted that our like ratio was 0.22: for every hundred tweets we reviewed, we liked twenty-two.
With the tweet samples in place, we prepared the training file in the format OpenAI requires and ran the fine-tuning job. Now we were the proud owners of a personalized OpenAI model. Let's see what it could do.
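For legacy fine-tuning (which Ada used), the training file is JSONL with one prompt/completion pair per line. The sketch below shows the general shape; the separator string and the " like"/" skip" label tokens are our assumptions for illustration, not values from the original setup:

```python
import json

# A sketch of a legacy OpenAI fine-tuning training file: one JSON object
# per line with "prompt"/"completion" keys. The SEPARATOR and the
# " like"/" skip" label tokens are assumptions made for this example.
SEPARATOR = "\n\n###\n\n"

def make_training_line(tweet_text, liked):
    """Turn one labeled tweet into a prompt/completion training example."""
    label = " like" if liked else " skip"  # leading space aids tokenization
    return {"prompt": tweet_text + SEPARATOR, "completion": label}

def write_training_file(labeled_tweets, path):
    """labeled_tweets: iterable of (text, liked) pairs; writes JSONL."""
    with open(path, "w", encoding="utf-8") as f:
        for text, liked in labeled_tweets:
            f.write(json.dumps(make_training_line(text, liked)) + "\n")
```

With a file like this, the legacy CLI of that era could launch training with something along the lines of `openai api fine_tunes.create -t tweets.jsonl -m ada` (the workflow has since changed, so treat the exact command as historical).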
We braced ourselves for a round of trials and tweaks as we began. The results from our first attempt, however, pleasantly surprised us.
When we ran classifications with our new model and examined the AI-generated feed, it was clearly a success. It might not be perfect, but it felt like a solid start.
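To classify a new tweet, the fine-tuned model is asked to complete the prompt with a single label token. The sketch below only builds the request body for the legacy completions endpoint; the model id and separator are placeholders, not our real values:

```python
# A sketch of a classification request to a fine-tuned model via the
# legacy completions endpoint. The model id and SEPARATOR are placeholders;
# we only construct the request payload here, without sending it.
SEPARATOR = "\n\n###\n\n"
FINE_TUNED_MODEL = "ada:ft-personal-2023-05-01"  # placeholder model id

def build_classification_payload(tweet_text):
    """Request body asking the model for a single label token."""
    return {
        "model": FINE_TUNED_MODEL,
        "prompt": tweet_text + SEPARATOR,
        "max_tokens": 1,   # one token is enough for a like/skip label
        "temperature": 0,  # deterministic output for classification
    }
```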
Here's a glimpse of the output generated by the system as we were drafting this blog post:
The like ratio was 0.26, which aligned closely with what we observed during the labeling process. It suggested that even modest training datasets could be representative.
As you can see from the screenshot above, processing 1,656 tweets cost us merely ten cents. Ada, indeed, comes across as a cost-effective option.
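A quick back-of-the-envelope calculation, using only the two numbers above, puts that cost in perspective:

```python
# Cost check using the figures from the post: 1,656 tweets for ~USD 0.10.
tweets_processed = 1656
total_cost_usd = 0.10
cost_per_tweet = total_cost_usd / tweets_processed  # well under a hundredth of a cent
tweets_per_dollar = round(1 / cost_per_tweet)       # roughly 16,560 tweets per dollar
```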
Our exploration into using AI for classifying information has been eye-opening, to say the least. The results we've seen from our personalized Twitter feed have shown us that AI is capable of executing this task and can do so efficiently and effectively.
Importantly, AI doesn't need us to create complex classification rules. All it asks of us is a training dataset. AI handles the rest once we provide examples of the type of information we want and the type we'd prefer to avoid.
And the possibilities are virtually endless. For example, think of incoming emails or service requests – they could all be sorted using this approach, enhancing efficiency and reducing noise in your work.
And this is just the beginning. More to come as we explore and fine-tune our use of AI in everyday scenarios. Stay tuned for more of our AI adventures.
(This article was originally posted by us on Medium)