I’m getting specific this week about which pages on your website Google actually sees and why you might not want it to see all of them. We’re also stepping into how AI search tools handle your site differently from Google, and where a newer file type fits into all of it.
Enjoy the episode!
🧰 I love me a good digital marketing tool. This week’s recommended tool is Repurpose.io.
It does just what it says: upload your content, and it repurposes it for other channels and sends it out automatically. So many hours to save. More time for creativity, relaxing, or just binge-watching Schitt’s Creek.
🎧 This week’s recommended podcast episode is 5 Signals That Reveal Where Your Podcast Funnel Leaking.
This one is for my fellow podcast hosts. The show is Podcasting for Financial Professionals, and even though I’m not one, the host, Virginia Elder, is a wealth of knowledge.
Find other recommended episodes here.
Other Resource Links
Update Older Content for Better Google Results WEBINAR
Listen on Spotify
Listen on Apple
We’re getting in the weeds this week with indexing, getting specific about what pages get seen or not seen by Google, and we’ll step into the AI search engines a bit too.
I came upon this topic because of some rumblings around how websites may or may not need an llms.txt file, which was referenced in an AI guide that Google recently released. Let’s first start with what indexing is, and by extension, de-indexing. Google Search traditionally serves up a single link to a webpage when someone searches, right?
You see that list known as search results. Each of those pages is indexed, cataloged by Google. They go out and crawl the gazillion website pages, and they tuck them away for future reference. Deindexing is the act of telling Google, usually by simply checking a box in your website’s SEO settings, that you don’t want a specific page or category of pages indexed by Google.
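For the technically curious: on many website platforms, that checkbox adds a robots meta tag containing "noindex" to the page's HTML. Here is a minimal Python sketch of how you could check a page's source for that tag. The class name, the function name, and the sample page are all made up for illustration, and your platform may deindex pages a different way (for example, with an HTTP header instead), so treat this as a rough check and not the way Google itself decides.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives from any <meta name="robots"> tag on a page."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values can be None (e.g. a bare attribute), so default to "".
        attr_map = {name.lower(): (value or "") for name, value in attrs}
        if attr_map.get("name", "").lower() == "robots":
            for part in attr_map.get("content", "").split(","):
                self.directives.add(part.strip().lower())


def is_noindexed(html: str) -> bool:
    """Return True if the page source carries a robots meta tag with noindex."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


# Hypothetical page source, standing in for something like a thank-you page.
sample_page = """
<html><head>
<title>Thanks for signing up</title>
<meta name="robots" content="noindex, follow">
</head><body>Thanks!</body></html>
"""

print(is_noindexed(sample_page))
```

To try it on your own site, view a page's source in your browser, paste it in place of sample_page, and see whether the deindex setting actually made it onto the page.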
Why the shit would somebody want that? The most common reason is poor user experience when someone lands on the page. If they happen to come across it in search and they see that page first, it might not represent your brand well. Another reason is to conserve crawl budget. The more Google has to crawl your website, the more it has to interpret, and it’s better to spend that budget on the pages you do want to show up.
For most of us, does crawl budget really matter? Maybe not, but when doing SEO and doing everything you can to get your website to rank in Google, all the things matter. And these indexed or deindexed pages aren’t the only things that affect crawl budget, but that’s outside the scope of this episode.
Tools like ChatGPT and the many others don’t index in the traditional Google way.
Some scan traditional results like Bing or Google first, and some use a sort of snapshot from data scraped over time. So the best way to get your website to, quote, index into these additional search platforms is to be listed in the traditional search results. Where does this llms.txt file come into play then?
Google has said, which I take with a grain of salt ’cause things change all the time, that they don’t use it specifically for ranking or indexing, but that it can be helpful for AI agents to better understand your website. It’s probably nothing urgent, but it could matter sooner than we realize. If your service is something that an AI agent can do for a human, you’ll want to consider having an llms.txt file on your website.
The most popular example is a tangible product like a shirt, flowers, or an airline ticket. If you don’t have e-commerce, an example might be something more transactional like a video editor via Upwork. Now, I don’t expect an AI agent to, quote, “buy me” at my website in the near future, for example.
But I’m still gonna put one on my website because what it could do is provide feedback to the human if the agent is doing research.
Oh, and I haven’t even told you what the llms.txt file is. Silly me. Let me elaborate, but with a couple of other files for context. Robots.txt is a traditional file that’s been used for ages that tells web crawlers where they are allowed to go on your website or whether a platform could see your website.
It’s like those automated gated communities where you can get in or out with the gates. Sitemap.xml gives search engines a flat list of what pages exist on your website. Again, this has been used for ages. An llms.txt file tells AI models how your content is structured and provides them a map to digest it.
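If you want a feel for what an llms.txt file might contain, here is a small Python sketch that builds one. It assumes the layout described in the community llms.txt proposal: a plain Markdown file with a title, a one-line summary, and sections of links with short notes. The function name, the business, and the URLs are all hypothetical, and no search engine or AI tool is confirmed to require this exact shape.

```python
def build_llms_txt(site_name, summary, sections):
    """Build llms.txt content as Markdown text.

    sections maps a heading to a list of (title, url, note) tuples.
    The layout is an assumption based on the community llms.txt proposal.
    """
    lines = [f"# {site_name}", "", f"> {summary}", ""]
    for heading, links in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for title, url, note in links:
            lines.append(f"- [{title}]({url}): {note}")
        lines.append("")
    # End the file with exactly one trailing newline.
    return "\n".join(lines).rstrip() + "\n"


# Hypothetical small-business site used only for illustration.
content = build_llms_txt(
    site_name="Example Video Editing",
    summary="Freelance video editing for small business owners.",
    sections={
        "Services": [
            ("Editing packages", "https://example.com/services", "What is offered and pricing"),
        ],
        "About": [
            ("Contact", "https://example.com/contact", "How to get in touch"),
        ],
    },
)

print(content)
```

The printed text is what you would save as a file named llms.txt at the top level of your website, next to robots.txt and sitemap.xml.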
No one of these things, done or not done, is going to make or break whether you rank in Google or get cited in AI, but we strive to do what we can to make it happen, captain!
The Takeaway
Something we didn’t talk about that you can quickly check is how your pages look in Google. See what pages are being indexed.
The simplest way to do this is to type site:yourdomain.com into the Google search bar. Google will return a list of pages it has indexed for your website. If you want assistance with how your website shows up in Google or AI searches, a clarity call might be the right next step.
Just you and me and your website.
About This Show
The Get More Website Traffic Podcast covers the strategies, tools, and tactics that help small business owners get more people to their website and turn that traffic into leads. Host Barb Davids breaks it down in plain language every week, with bonus episodes featuring other business owners sharing their expertise on topics that matter to running a small business. Produced by Compass Digital Strategies.