Welcome to Zettabytes, a new Substack newsletter on the hypergrowth of AI-fueled data. A shout-out to the many new subscribers who have signed up. Thank you!
In my Substack bio, I describe myself as a tech journalist who writes about data, databases, cloud, infrastructure, data centers, and an Australian Shepherd named Finn. This post is about the data and the dogs.
Like many families, we got a dog during the Covid pandemic. Finn came from a farm in Pennsylvania. When he was a puppy, his teeth were sharp and could draw blood when he nipped. Finn is still a bit of a handful, but as long as he gets his exercise, everyone’s happy.
Australian Shepherds, a.k.a. Aussies, are a working breed. The merles are popular on TV with their mixed coats and different-colored eyes. Finn is what’s called a tricolor, with red, white, and tan. Unlike many Aussies you see in the US, he has a tail.
The Chihuahua bake-off
According to Anthropic’s Claude, there are approximately 900 million to 1 billion dogs around the world. Just over half are pets; the rest are strays, feral, or working dogs. The US has the highest pet dog population, at roughly 70 to 90 million.
Dogs range in size from Chihuahuas at the small end, weighing as little as 2 pounds, to Great Danes and Mastiffs at 200 pounds. Finn tips the scale at about 50 pounds. On a good day, I say he’s athletic. But not always.
Speaking of Chihuahuas, you may have seen the image grid of Chihuahuas and blueberry muffins that has been used in machine learning. It’s a well-known test, and it even exists as a deep learning project on GitHub.
If you check out the GitHub link, there are similar image grids with Labradoodles juxtaposed with fried chicken, and dogs (not Beagles) that look like bagels.
Data-driven dogs
Dogs and data have much in common. In the tech industry, there’s Datadog, Zynga, and Boston Dynamics’ Spot.
Also, dogs are becoming data generators, according to Claude. Here I’m switching into AI mode:
Military and police K-9 units now wear body cameras and biometric sensors that collect massive amounts of data about their stress levels, heart rate, and behavioral patterns during operations. This data is being used to optimize their training, predict their performance in different scenarios, and even identify early signs of PTSD in working dogs.
Even more interesting is how pet tech companies are using machine learning on dog data. Companies like Whistle and FitBark collect activity data from millions of dogs worldwide, creating datasets that help identify breed-specific health patterns, predict illnesses before symptoms appear, and even detect changes in behavior that might indicate problems at home.
Perhaps most surprisingly, some researchers are using dogs as living sensors for detecting diseases in humans. Dogs can be trained to identify cancer, diabetes episodes, and seizures through scent - essentially turning them into biological data collection devices that can detect molecular changes humans and machines miss.
There's also a growing field called "digital ethology" where researchers use computer vision and AI to analyze vast amounts of video data of dogs to understand their communication patterns, emotional states, and social behaviors in ways that were impossible before big data analytics.
So dogs have gone from being our companions to being both data generators and biological sensors in our increasingly connected world.
A dog’s life
This wouldn’t be a dog blog without something on workplace collaboration. So here’s a picture of Finn (on the right) with his buddies at day care.
What this newsletter is all about
Zettabytes is new. As I said in the day one post, I plan to use AI as a creative, research, and writing assistant, while remaining the human in the loop. It’s an experiment and I welcome feedback.
For more on what Zettabytes is all about, see the Welcome post below.