Creating Automations with AI
A Freelancer’s Guide

Ah, this feels nostalgic. Some of you might remember that the very first article I wrote all the way back in 2017 — more than eight years ago! — was on this exact topic: creating macros using a neat little tool called AutoHotkey. Subtitlers quite liked my scribbles and followed them step by step to create simple automations, which made their work a little bit easier.

Fast-forward to now. Things have changed a lot: people have grown more tech-savvy, the software has improved in many ways, and, of course, gen AI has taken the world by storm. So, I’ve decided it’s time to revisit the subject and show you what’s possible in 2025.

In this article, I will teach you to create robust, powerful automations with ease — no programming needed.

Alright, fasten your seatbelts. We’re going in.

Part 1: Bookmarklets

One thing I’ve never liked about AutoHotkey is how you need to know a button’s position on the screen or the exact way it looks to be able to click it. Move your browser window, change the monitor resolution, zoom in or out, or simply share your script with someone else — and the macro will likely no longer work, unless you sprinkle it with some highly advanced code (which even I might struggle with).

However, I’ve discovered a more robust (and simpler!) way to interact with webpage elements, one that doesn’t involve coordinates, screenshots, or AutoHotkey at all: so-called bookmarklets.

What is a bookmarklet? Well, basically it’s like a normal browser bookmark, but instead of a webpage address, the URL field holds a piece of JavaScript code (starting with javascript:) that runs on the page you have open when you click the bookmark.
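To make that a bit more concrete, here is a minimal sketch of what ends up in the URL field. The `toBookmarklet` helper is hypothetical (it isn’t part of any browser or library); it only illustrates the usual shape of a bookmarklet: the `javascript:` prefix followed by a small self-running function.

```javascript
// Hypothetical helper: turns a snippet of JavaScript into a bookmarklet URL.
function toBookmarklet(source) {
  // Wrapping the snippet in a self-running function keeps its variables
  // from leaking into the page.
  const wrapped = `(function(){${source}})();`;
  // Encoding protects spaces and special characters inside the URL field.
  return 'javascript:' + encodeURIComponent(wrapped);
}

// Example: a bookmarklet that pops up a greeting on whatever page is open.
const greetingUrl = toBookmarklet("alert('Hello!');");
console.log(greetingUrl);
```

The printed line is what you would paste into a bookmark’s URL field. In practice you rarely build it by hand, since the chatbot usually returns the finished `javascript:` line.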

Bookmarklets can do some cool stuff:

- Open new tabs
- Interact with web elements: click on buttons, populate text fields, check boxes, scroll through pages, etc.
- Change UI style and appearance: height, width, font size, color, transparency, and so on.
- Perform multiple actions in succession
- And more

Let me give you a few examples. This bookmarklet opens a new tab with Wikipedia.org:

This one writes “George Washington” in Wikipedia’s search field:

This one clicks on the “Submit” button on the same page:

And this one finds the opened article’s main title and increases its font size by 50%:

If you use Google Chrome and want to try these, just create four bookmarks, paste one of the codes below into each bookmark’s URL field, and click the bookmarks in order.
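Here is a sketch of what those four codes could look like. The selectors are assumptions about Wikipedia’s current markup (`#searchInput` for the search field, `button[type=submit]` for the “Submit” button, `#firstHeading` for the article title) and may need adjusting if the site changes. The functions show each action in readable form; the `bookmarklets` list holds the same actions condensed into single lines for the URL field.

```javascript
// Readable versions. Each function receives the browser objects it needs.

// 1. Open Wikipedia in a new tab.
function openWikipedia(win) {
  win.open('https://www.wikipedia.org/', '_blank');
}

// 2. Write "George Washington" into the search field (assumed id: searchInput).
function fillSearch(doc) {
  doc.querySelector('#searchInput').value = 'George Washington';
}

// 3. Click the search form's submit button (assumed selector).
function clickSubmit(doc) {
  doc.querySelector('button[type=submit]').click();
}

// 4. Make the article's main title 50% larger (assumed id: firstHeading).
function enlargeTitle(doc, win) {
  const title = doc.querySelector('#firstHeading');
  const current = parseFloat(win.getComputedStyle(title).fontSize);
  title.style.fontSize = current * 1.5 + 'px';
}

// The same four actions as one-line bookmarklets, one per bookmark.
const bookmarklets = [
  "javascript:(function(){window.open('https://www.wikipedia.org/','_blank');})();",
  "javascript:(function(){document.querySelector('#searchInput').value='George Washington';})();",
  "javascript:(function(){document.querySelector('button[type=submit]').click();})();",
  "javascript:(function(){var t=document.querySelector('#firstHeading');t.style.fontSize=parseFloat(getComputedStyle(t).fontSize)*1.5+'px';})();",
];
bookmarklets.forEach((code) => console.log(code));
```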

Creating a bookmarklet

So, how do we create these things without knowing the first thing about JavaScript or programming? Well... we just ask an AI chatbot to do that for us!

- Head over to gemini.google.com and register an account in a few clicks
- Open a new chat and select the latest model
- Ask it to create the bookmarklet you want
- Wait for the code to generate
- Paste it into your bookmark’s URL field
- Done!

Something like this:

Simple, right? And awfully wrong — this won’t work at all, because the chatbot has no way of knowing what you mean by “search field”. So, we need to tell it, and for that we will use so-called selectors.

HTML 101

First, a small detour.

(This part will be a bit technical, but please bear with me — it will all make sense in the end.)

Let’s start with the basics here. Each webpage on the internet is actually a piece of code which describes how different visual elements will be styled and positioned once you load that webpage in your browser. For instance, this simple structure...

...is produced by the following code, written in a markup language called HTML:

The different web elements are defined using special <tags>. For instance:

And so on. There are over a hundred different tags, which often come in pairs, like this: <p> Text </p>

Elements can also have attributes, such as style, type, src (source), class and id, contained within the opening tag. Each attribute provides additional info about the element via its value, enclosed in quotation marks:

Furthermore, HTML elements can be nested inside each other. For example, within a “parent” div container, you can have two other “child” divs, each with two buttons in them. So, the code for the following element...

...would look something like this:
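As a minimal sketch of such a nested structure, here it is held in a JavaScript string so that it sits next to the other examples. Only the “left-section” name and the “Button 1” text come from this article; treating “left-section” as a class, as well as the names “container” and “right-section” and the other button labels, are assumptions for illustration.

```javascript
// A parent div containing two child divs, each holding two buttons.
const nestedHtml = `
<div class="container">
  <div class="left-section">
    <button>Button 1</button>
    <button>Button 2</button>
  </div>
  <div class="right-section">
    <button>Button 3</button>
    <button>Button 4</button>
  </div>
</div>`;

// Count the opening tags to confirm the shape: one parent plus two children
// makes three divs, and two buttons per child makes four buttons.
const divCount = (nestedHtml.match(/<div\b/g) || []).length;
const buttonCount = (nestedHtml.match(/<button\b/g) || []).length;
console.log(divCount, buttonCount);
```

The indentation mirrors the nesting: anything written between an opening tag and its closing tag is a child of that element.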

Selectors

So far so good? Okay, great. Now, we move on to so-called selectors, the things that we need for our chatbot prompts. What is a selector? Put simply, it’s a special pattern that points to a particular web element on a page, something that uniquely identifies it — perhaps its color, or a word it contains, or its class, or something like that. If you give your chatbot the right selector, it will know exactly which element you have in mind.

Now, let’s go back to the code block above. What pattern can you see for the first button element? What’s different about it compared to the other buttons?

...

Yes, it’s the only one with the text “Button 1” and the first one within its left-section div.

Here’s another example:

The second input element — what’s special about it? That’s right, it has an id attribute equal to last name. No other element has the same id.
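For the curious, here is roughly how such plain-language patterns translate into the CSS selectors that a bookmarklet uses behind the scenes. This is a sketch under assumptions: the id is written as `last-name` (an id cannot contain a space, so the exact spelling is a guess), and “left-section” is treated as a class.

```javascript
// Plain-language description -> CSS selector.
const exampleSelectors = {
  // "the input whose id is last-name"
  byId: 'input#last-name',
  // "the first button inside the div with class left-section"
  firstLeftButton: 'div.left-section button:first-of-type',
};

// A bookmarklet hands the selector to querySelector, which returns the
// first matching element, or null when nothing on the page matches.
function findElement(doc, selector) {
  return doc.querySelector(selector);
}

// In a browser, try it on the current page; skipped elsewhere.
if (typeof document !== 'undefined') {
  console.log(findElement(document, exampleSelectors.byId));
}
```

You don’t need to write these by hand. Describing the pattern in words is enough for the chatbot, but recognizing a selector helps when you read the code it gives back.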

Let’s now return to Wikipedia and try to find a good selector for our search field in the center. What we’ll do is right-click on it and go to “Inspect”:

This will show you the field’s code in the browser’s developer tools, in a panel I’ll simply call the console:

Yeah, compared to our simple examples, real code looks quite scary. But fret not, the same principles apply. Our search field is the highlighted input element. Can you figure out what’s unique about it?

...

If you use Ctrl+F in the console to check for matches, you will find that, among other things, the input’s id has no duplicates on the page. Nice!

Now, all that’s left to do is ask our chatbot in a way it understands:

Bam! Grab the code and you’re good to go.
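The exact code will differ from chat to chat, but once the selector is spelled out, the result tends to boil down to something like this sketch. The `fillField` name is hypothetical, and `#searchInput` is my assumption about the id you will find in the Inspector.

```javascript
// Fills the element matched by `selector` with `text`.
// Returns true on success and false when nothing on the page matches.
function fillField(doc, selector, text) {
  const field = doc.querySelector(selector);
  if (!field) return false; // wrong or outdated selector
  field.value = text;
  // Tell the page the value changed, as if it had been typed in.
  if (typeof field.dispatchEvent === 'function' && typeof Event === 'function') {
    field.dispatchEvent(new Event('input', { bubbles: true }));
  }
  return true;
}

// In a browser, run it against the current page; skipped elsewhere.
if (typeof document !== 'undefined') {
  fillField(document, '#searchInput', 'George Washington');
}
```

The early `return false` is the useful part when troubleshooting: if a bookmarklet silently does nothing, a selector that matches no element is the first thing to suspect.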

Exercises

Alright, enough theory — time to practice! Follow the procedure mentioned above:

- Inspect an element
- Find something unique about its structure or attributes
- Describe what you want to achieve in a chatbot prompt
- Get the code, create a bookmarklet, and see if it works!

(You can skip this part for now and come back later.)

Practice elements (interactive on the original page), listed by label: “Change text color to purple”, “Check”, “Hey!”, “Toggle”, “Press”.

Some tips for getting good selectors:

- In the Inspector console, click on the right-arrow (▸) in front of an element to see its children.
- Make sure to pick the right element for your selector, rather than its parent or sibling.
- Use Ctrl+F in Inspector to type a value and see if it’s indeed unique (i.e. has no other matches).
- Use partial values: “a button whose class contains "red-button"”
- Combine descriptions: “a button whose id="login" and label="primary"”
- Exclude unwanted parts: “a button whose name doesn’t start with "submit"”
- Describe ancestry: “the second button inside a div whose id equals "password-box"”
- Attributes and their values can sometimes change each time you load a webpage or interact with an element on it. Focus on stable, unchanging parts and avoid arbitrary, volatile ones (which usually look generic or random, e.g. class="a2HcBX4Kn"). A good way to check is to reload the page and see if your bookmarklet still works.
- Ask the bot not to add comments to the code: a bookmarklet is squeezed into a single line, so a // comment would turn everything after it into a comment too.
- If the code doesn’t work, explain the situation to the bot and ask for alternative solutions.
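The descriptions in these tips map onto CSS selectors fairly directly. Below is a sketch of that mapping, plus a tiny uniqueness check that does the same job as Ctrl+F. The names `tipSelectors`, `countMatches` and `secondButtonInBox` are hypothetical, and the attribute values are the made-up ones from the tips.

```javascript
// The descriptions from the tips, written as CSS selectors.
const tipSelectors = {
  classContains: 'button[class*="red-button"]',   // class contains "red-button"
  combined: 'button#login[label="primary"]',      // id and label together
  excluded: 'button:not([name^="submit"])',       // name doesn't start with "submit"
  insideBox: 'div#password-box button',           // any button inside that div
};

// Counts how many elements a selector matches; 1 means it is unique.
function countMatches(doc, selector) {
  return doc.querySelectorAll(selector).length;
}

// "The second button inside a div whose id equals password-box".
function secondButtonInBox(doc) {
  return doc.querySelectorAll(tipSelectors.insideBox)[1] || null;
}

// In a browser, report the match count for each selector; skipped elsewhere.
if (typeof document !== 'undefined') {
  for (const [name, selector] of Object.entries(tipSelectors)) {
    console.log(name, selector, countMatches(document, selector));
  }
}
```

A count of 0 means the selector matches nothing on the page, and anything above 1 means it is not specific enough yet.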

Part 2: Extensions

As cool as bookmarklets are, they come with some big limitations:

- Some websites and browsers prohibit their use (for example, a site with a strict Content Security Policy can block them)
- They occupy space on the bookmarks bar, which is limited
- They can’t emulate keyboard events like pressing a shortcut
- Any visual change you make to a webpage will disappear upon a reload
- Particularly complex bookmarklets will often fail to work for various reasons

Overall, bookmarklets can be fun to tinker with, but they aren’t very useful in practice.

So, why did we go through all the trouble of learning how to make them? Haha, well, it was a ruse! My real goal was to teach you how to find and test selectors, because that’s precisely what you’ll need for our next step: creating your own browser extensions!

Creating new shortcuts

Yep, you read that right! Custom, personalized Chrome extensions for our own individual needs. They’re a lot more powerful than bookmarklets and can do a whole bunch of different things:

- Perform complex page interactions with clicking, scrolling, toggling, etc.
- Change webpage elements’ appearance in a persistent way (by reapplying your changes on every page load)
- Add new UI elements: buttons, menus, overlays, popups
- Emulate mouse and keyboard events
- Add right-click context menu options
- Perform scheduled tasks
- React to your actions
- And much more

Those of you without much programming knowledge might be thinking right now, “Oh hell naw!” But don’t worry, it’s surprisingly easy. All you need to do is:

- Prompt your chatbot to create the extension you want
- Copy the generated code into text files
- Load the files into your browser
- Voila! Job complete!

Pretty straightforward, and the real magic lies in the prompting part — it’s where your imagination and analytical thinking really come through.

Okay, for starters, we can create a simple extension. How about one that does what our bookmarklets did, but via shortcuts?

Step #1: Add the following prompt in your AI chat:

Step #2: Wait for the code to generate and then copy it to the respective text files (your chatbot will tell you which ones). The files must be in one folder and have the correct file extensions (such as .json or .js, not .txt). If your operating system hides file extensions, make them visible first.

Step #3:

- Type chrome://extensions in your browser’s URL bar and press Enter
- In the upper-right corner, turn on Developer mode
- Click “Load unpacked”, navigate to your folder, and confirm your selection

And that’s it. You’ve got yourself an extension. For best results, use the most powerful LLM you have access to, such as Gemini 2.5 Pro, ChatGPT o3, or newer. Small models and the GPT-4x series will give poor results.
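To give you a sense of what the chatbot hands back in Step #2: every Chrome extension has a file called manifest.json that describes it. Below is a sketch of one for a shortcut extension, written as a JavaScript object so it can be printed as JSON. The extension name, command names, key combinations and the background.js file name are all assumptions for illustration; your chatbot’s version will differ.

```javascript
// Contents of a hypothetical manifest.json, shown as a JavaScript object.
const manifest = {
  manifest_version: 3,
  name: 'Wikipedia Shortcuts',
  version: '1.0',
  description: 'Keyboard shortcuts for a few Wikipedia actions.',
  // Lets the extension run code in the tab you are currently using.
  permissions: ['activeTab', 'scripting'],
  // The script that listens for the shortcuts.
  background: { service_worker: 'background.js' },
  // Each command pairs a shortcut with a description.
  commands: {
    'open-wikipedia': {
      suggested_key: { default: 'Alt+Shift+1' },
      description: 'Open Wikipedia in a new tab',
    },
    'fill-search': {
      suggested_key: { default: 'Alt+Shift+2' },
      description: 'Write "George Washington" in the search field',
    },
  },
};

// Printing it yields the JSON text that would be saved as manifest.json.
console.log(JSON.stringify(manifest, null, 2));
```

If the extension refuses to load, this file is a good first place to look, since Chrome reports manifest errors right on the chrome://extensions page.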

Producing action sequences

Instead of using individual shortcuts, we can execute multiple actions in one go. Here’s a prompt example:
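Under the hood, a sequence like that usually comes down to running the steps one by one with a short pause in between, so the page has time to react. Here is a minimal sketch of the idea. The `runSequence` name, the 500 ms pause and the Wikipedia selectors are assumptions, not what your chatbot will necessarily produce.

```javascript
// Runs the given actions one after another, pausing after each one.
// `actions` is a list of functions; `pauseMs` is the wait in milliseconds.
async function runSequence(actions, pauseMs = 500) {
  for (const action of actions) {
    await action();
    await new Promise((resolve) => setTimeout(resolve, pauseMs));
  }
}

// In a browser: fill the search field, then click the submit button.
if (typeof document !== 'undefined') {
  runSequence([
    () => { document.querySelector('#searchInput').value = 'George Washington'; },
    () => { document.querySelector('button[type=submit]').click(); },
  ]);
}
```

If a sequence only works some of the time, the pause is a likely culprit: ask the bot to wait longer, or to wait until the next element actually appears on the page.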

Modifying user interface

Now let’s step it up a notch. This time, instead of shortcuts, I’ll create four buttons that change the visual style of the web elements on Wikipedia:

Not very useful, I know, but just to give you an idea of what’s possible.
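For reference, the page-side part of such an extension can be quite small. The sketch below adds a floating panel with four buttons. The button labels and the styles they apply are my own placeholders rather than the ones from my prompt, and `addStylePanel` is a hypothetical name.

```javascript
// Four buttons, each with a label and the style change it applies.
const styleButtons = [
  { label: 'Bigger text', apply: (el) => { el.style.fontSize = '120%'; } },
  { label: 'Serif font', apply: (el) => { el.style.fontFamily = 'Georgia, serif'; } },
  { label: 'Dark mode', apply: (el) => { el.style.background = '#111'; el.style.color = '#eee'; } },
  { label: 'Reset', apply: (el) => { el.style.cssText = ''; } },
];

// Builds a panel pinned to the top-right corner and adds it to the page.
function addStylePanel(doc) {
  const panel = doc.createElement('div');
  panel.style.cssText = 'position:fixed;top:10px;right:10px;z-index:9999';
  for (const { label, apply } of styleButtons) {
    const button = doc.createElement('button');
    button.textContent = label;
    // Each click applies that button's style change to the page body.
    button.addEventListener('click', () => apply(doc.body));
    panel.appendChild(button);
  }
  doc.body.appendChild(panel);
  return panel;
}

// In a browser, add the panel to the current page; skipped elsewhere.
if (typeof document !== 'undefined') addStylePanel(document);
```

In an extension, this would live in a content script, which Chrome runs on every matching page. That is what makes the buttons reappear after a reload.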

Adding functionality

Okay, how about a quick synonym lookup?

Here’s what you’ll see upon right-clicking on a selected word:

The free, publicly available APIs aren’t very good, unfortunately, but you could supercharge your synonyms with pay-per-token ones, such as the Gemini API or the OpenAI API, which are much more powerful.
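As a rough sketch of how such an extension can be wired up, here is a possible background script. It assumes the free Datamuse API as the synonym source (my pick for illustration; your chatbot may choose another service), and the names `buildSynonymUrl` and `registerSynonymMenu` are hypothetical. It simply logs the synonyms, whereas a real extension would show them in a popup or overlay.

```javascript
// Builds the request address for the synonym service.
// Assumption: the Datamuse API, where rel_syn asks for synonyms of a word.
function buildSynonymUrl(word) {
  return 'https://api.datamuse.com/words?max=5&rel_syn=' + encodeURIComponent(word.trim());
}

// Adds the right-click menu entry and reacts when it is chosen.
function registerSynonymMenu(chromeApi) {
  chromeApi.runtime.onInstalled.addListener(() => {
    chromeApi.contextMenus.create({
      id: 'lookup-synonyms',
      title: 'Find synonyms for "%s"', // %s is replaced by the selected text
      contexts: ['selection'],          // show only when text is selected
    });
  });
  chromeApi.contextMenus.onClicked.addListener(async (info) => {
    if (info.menuItemId !== 'lookup-synonyms') return;
    const response = await fetch(buildSynonymUrl(info.selectionText));
    const words = (await response.json()).map((entry) => entry.word);
    console.log(words.join(', '));
  });
}

// Inside an extension, hook everything up; skipped elsewhere.
if (typeof chrome !== 'undefined' && chrome.contextMenus) registerSynonymMenu(chrome);
```

For this to work, the manifest needs the "contextMenus" permission, and swapping in a paid API mostly means changing `buildSynonymUrl` and adding your key, with all the security caveats mentioned further down.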

Making Extensions: Tips & Tricks

These might look like AI slop, but no, I wrote them myself 😆

General:
- Be concrete:
  - It’s best to describe the desired structure/functionality/design with a good level of detail, so that the bot doesn’t have to guess what you had in mind. Write in a way that leaves little room for misinterpretation.
  - At the same time, do not overdo it — bots can understand only so many instructions before the output quality begins to deteriorate.
- Add visual references:
  - Attach a screenshot of what you’d like your extension to look like.
  - Draw a rough paper sketch of your idea and upload it to the prompt.
- Iterate:
  - Creating extensions via AI is very much trial and error.
  - If the output is close to ideal but you would like to change a few things, tell the bot which elements should be modified and in what way. Continue iterating until you’re satisfied with the result.
- Troubleshoot:
  - Test your selectors via bookmarklets.
  - If the extension doesn’t load or doesn’t work correctly, explain to the bot what kind of issue/error you’re getting and ask it to fix it. If that goes nowhere, start over in a new chat or use a different AI model.

Shortcut scope:
- You can switch between “Global” and “In Chrome” via chrome://extensions/shortcuts

Advanced:
- Read the thoughts: if you’re using a “thinking” model (which you should), taking a look at the chatbot’s thought process can give you a hint about what it might have misunderstood.
- Go meta:
  - Give the bot a brief summary of what you want and let it brainstorm for you: figure out the best setup, the plan, the pitfalls, the alternatives, and so on.
  - Ask it what the optimal phrasing for your prompt would be: what to mention, what to omit, how to structure it, etc.
  - If you get a good result after some troubleshooting, ask the bot what you should add to your future prompts to avoid this problem going forward.

Security:
- Never download other people’s add-on files directly, as they can contain malicious code. If you want to try someone’s extension, ask them for a link to their chatbot conversation, make sure the prompts don’t include anything sketchy, and copy them to your own chat.
- Never add any proprietary or sensitive information to your prompts.
- Never create extensions that will break a contract or NDA you’re under.
- Extensions that utilize your private API keys (e.g. a ChatGPT-powered AI research assistant or synonym lookup) can be very risky, as your keys may get leaked. Avoid posting/sharing such extensions, or even creating them altogether, unless you know what you’re doing.

Limitations:
- An extension can suggest up to four shortcuts of its own (if you need more, assign extra ones manually via chrome://extensions/shortcuts, create multiple extensions, or use popups with buttons instead).
- The longer your conversation gets, the dumber the AI becomes. Keep this in mind when iterating and bugfixing.
- Some chatbots aren’t good at iterating and/or troubleshooting (e.g. Claude 3.7 Sonnet); instead of changing one segment in the code, they replace big parts or the whole thing.

Alright, this is it for now. Initially I wanted to include Part 3 about creating your own desktop apps with AI, but then I realized it would require a much deeper dive and even some programming, so maybe next time! Anyways, thanks for making it all the way to the end — I hope you found this article insightful! Now go and create some extensions 😁

As always, feel free to share your questions, thoughts and remarks in a comment below.

Until next time!
