This page is a feed of all my #ai posts in reverse chronological order.
I can't imagine I'm the first to try this, but new hobby acquired:
I ran the ones below on the spot and it was quite fun. Before this, whenever I visited the British Museum (a few times a year), I didn't really give most of those statues a second glance.
An exercise for the reader (this one's interesting because they put a reference of what it could have looked like if it were complete, based on a different statue):
And another bust of good old Caesar (might be interesting as there's so much reference material, and it's so broken):
Try it and have fun! I'll try another batch the next time I go.
In the past, I touched on how I think most of the ways that original creators make money are broken. This can be anything from the authors of academic journal papers, to YouTubers, to painters, to bloggers. For the purpose of this post, I will call these creators producers and the people who consume their art (I use the word art in the most general sense) consumers. After my previous post, I spoke to some friends about Google's new Bard (the inevitable assistant-in-search) and had some additional thoughts that I wanted to share.
On the internet, making money as a producer of "free" art is often connected to advertising in some way. I think it's uncontroversial that making money off advertising is a "hack". The logic goes that as a creator, you've captured attention, and now you can sell that attention to others who want it. I feel like this pollutes the original creation, and that art should exist on its own. I don't want some pages in my novel to have ads, or a walk in a gallery to have ads, or indeed to be pulled out of the immersion of a TV show to see ads. Consumers and producers of content would generally agree that ads within their work are undesirable.
We get to the crux of the issue: how else can producers make money? I think there are 3 general ways:
https://news.ycombinator.com/item?id=34690401
https://news.ycombinator.com/item?id=34681820
For a long time I've been interested in the idea of creating a digital twin of yourself. I've tried this in the past with prompt completion trained on many years of my chat data, but it was always just a bit too inaccurate and hollow.
I also take a lot of notes, and have been taking more and more recently (a subset of these are public, like this post you're reading right now). I mentioned recently that I really think that prompt completion on top of embeddings is going to be a game-changer here.
You probably already know about prompt completion (you give it some text and it continues it, like auto-complete on steroids), which underpins GPT-3, ChatGPT, etc. However, it turns out that a lot of people aren't familiar with embeddings. In a nutshell, you can turn blocks of text into high-dimensional vectors. You can then do interesting things in this vector space, for example find the distance between two vectors to reason about their similarity. CohereAI wrote an ELI5 thread about embeddings if you want to learn more.
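To make this concrete, here's a minimal sketch of embedding two notes and measuring their similarity. I'm assuming the OpenAI Python client and its text-embedding-ada-002 model here, but any embedding API would do:

```python
# Minimal sketch: embed two notes and compare them by cosine similarity.
# Assumes `pip install openai numpy` and an API key in OPENAI_API_KEY.
import numpy as np
import openai

def embed(text: str) -> np.ndarray:
    """Turn a block of text into a high-dimensional vector."""
    response = openai.Embedding.create(model="text-embedding-ada-002", input=text)
    return np.array(response["data"][0]["embedding"])

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Close to 1.0 means similar meaning; near 0 means unrelated."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

note_a = embed("Ideas for building a digital twin from my notes")
note_b = embed("Training a chatbot on years of my chat history")
print(cosine_similarity(note_a, note_b))
```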
None of this is particularly new -- you might remember StyleGAN from some years ago, which is what first made this concept of a latent space really click for me, because it's so visual. You could generate a random vector that gets decoded to a random face or other random things, and you could "morph" between faces by moving through this high-dimensional space. You could also find "directions" in this space (think PCA), to e.g. make a slider that increases your age when you move in that direction, while keeping other features relatively unchanged, or you could find the "femininity" direction and make someone masculine look more feminine, or a "smiling amount" direction, etc.
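For a flavour of how such a direction can be found, here's a hedged sketch using plain linear regression rather than PCA, with random arrays as hypothetical stand-ins for real StyleGAN latents and attribute labels:

```python
# Sketch: find an "age" direction in a latent space from labelled latents.
# The data here is random and purely illustrative -- real usage would take
# StyleGAN latent vectors with per-image attribute labels.
import numpy as np

rng = np.random.default_rng(0)
latents = rng.normal(size=(1000, 512))   # 1000 latent vectors, 512-D
ages = rng.uniform(18, 80, size=1000)    # an age label per latent

# Least-squares fit: the vector along which age increases most linearly.
direction, *_ = np.linalg.lstsq(latents, ages - ages.mean(), rcond=None)
direction /= np.linalg.norm(direction)

# The "age slider": nudge a face's latent along the direction to age it,
# leaving the orthogonal components (other features) mostly untouched.
face = latents[0]
older_face = face + 3.0 * direction
```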
Text embeddings have an equivalent in the image world: given an image, you can hill-climb to find the latent vector that generates the closest possible image to it (which you can then manipulate). I experimented with this using my profile picture (this was in August 2021, things have gotten much better since!):
Today, I discovered two new projects in this space. The first was specifically for using embeddings for search which is not that interesting but, to be fair, is what it's for. In the comments of that project on HackerNews, the second project was posted by its creator which goes a step further and puts a chat interface on top of the search, which is the exact approach I talked about before and think has a lot of potential!
Soon, I would like to be able to have a conversation with myself to organise my thoughts and maybe even engage in some self-therapy. If the conversational part of the pipeline were also fine-tuned on personal data, this could be the true starting point for creating digital twins that replace us and even outlive us!
Some weeks ago I built the "Muslim ChatGPT". From user feedback, I very quickly realised that this is one use case that absolutely won't work with generative AI. Thinking about it some more, I came to a soft conclusion that, at the moment, there's a set of use cases that generative AI is just not well suited to.
There's a class of computational problems (NP) whose defining property, for our purposes, is that solutions are easy to verify even when they're hard to find. For example, it's hard to solve a Sudoku puzzle, but easy to check that a completed grid is correct.
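The Sudoku example makes the asymmetry easy to see in code. Verifying a completed grid is a few lines and runs in linear time, while solving one requires search:

```python
# Checking a completed 9x9 Sudoku grid (digits 1-9) is trivial: every row,
# column, and 3x3 box must contain each digit exactly once.
def is_valid_sudoku(grid: list[list[int]]) -> bool:
    units = list(grid)                                               # rows
    units += [[grid[r][c] for r in range(9)] for c in range(9)]      # columns
    units += [[grid[r + i][c + j] for i in range(3) for j in range(3)]
              for r in (0, 3, 6) for c in (0, 3, 6)]                 # boxes
    return all(sorted(unit) == list(range(1, 10)) for unit in units)
```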
Similarly, I think that there's a space of GPT use cases where the results can be verified with variable difficulty, and where having correct results is of variable importance. Here's an attempt to illustrate what some of these could be:
The top right here (high difficulty to verify, but important that the results are correct) is a "danger zone", and also where deen.ai lives. I think that as large language models become more reliable, the risks will be mitigated somewhat, but in general not enough, as they can still be confidently wrong.
At the bottom, the use cases are much less risky, because you can easily check the results, but the product might still be pretty useless if the answers are consistently wrong. For example, we know that ChatGPT still tends to be pretty bad at maths and things that require multiple steps of thought, but crucially: we can tell.
The top left is kind of a weird area. I can't really think of use cases where the results are difficult to verify, but also you don't really care if they're super correct or not. The closest use case I could think of was just doing some exploratory research about a field you know nothing about, to make different parts of it more concrete, such that you can then go and google the right words to find out more from sources with high verifiability.
I think most viable use cases today live in the bottom and towards the left, but the most exciting use cases live in the top right.
Another important spectrum is whether your use case relies more on recall or on synthesis. Asking for the capital of France is recall, while generating a poem is synthesis. Generating a poem using the names of all cities in France is somewhere in between.
At the moment, LLMs are clearly better at synthesis than recall, and that makes sense when you consider how they work. Indeed, most of their failures happen when they're a bit too loose with making stuff up.
Personally, I think that recall use cases are very under-explored at the moment, and have a lot of potential. This contrast is painted quite well when comparing two recent posts on HN. The first is about someone who trained nanoGPT on their personal journal here and the output was not great. Similarly, Amarbot used GPT-J fine-tuning and the results were also hit and miss.
The second uses GPT-3 Embeddings for searching a knowledge base, combined with completion to have a conversational interface with it here. This is brilliant! It solves the issues around needing the results to be as correct as possible, while still assisting you with them (e.g. if you wanted to ask for the nearest restaurants, they better actually exist)!
Somebody in the comments linked gpt_index so you can do this yourself, and I really think that this kind of architecture is the real magic dust that will revolutionise both search and discovery, and give search engines a run for their money.
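In case it's useful, here's a hedged sketch of that architecture, reusing the embed and cosine_similarity helpers from the earlier embeddings sketch. The knowledge base and prompt wording are made up; the point is that the model answers only from retrieved text:

```python
# Sketch: embeddings for retrieval + completion for the conversational layer.
# Assumes the OpenAI Python client; `embed` and `cosine_similarity` are the
# helpers defined in the earlier embeddings sketch.
import openai

knowledge_base = [
    "Falafel House, 12 High Street, open until 23:00.",
    "Nonna's Pizzeria, 4 Market Square, closed on Mondays.",
]
kb_vectors = [embed(doc) for doc in knowledge_base]

def answer(question: str, top_k: int = 2) -> str:
    q = embed(question)
    scores = [cosine_similarity(q, v) for v in kb_vectors]
    ranked = sorted(zip(scores, knowledge_base), reverse=True)
    context = "\n".join(doc for _, doc in ranked[:top_k])
    prompt = (
        "Answer using ONLY the context below; say 'I don't know' otherwise.\n"
        f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    )
    completion = openai.Completion.create(
        model="text-davinci-003", prompt=prompt, max_tokens=100
    )
    return completion["choices"][0]["text"].strip()

print(answer("Which restaurants near me are open late?"))
```

Because the answers are grounded in retrieved entries, the restaurants it recommends at least actually exist.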
Recently, people whose work I admire forced me to confront the "art not artist" dilemma once more. In this case, Nick Bostrom with racism, and Justin Roiland with domestic abuse.
Thinking about it more generally, I guess it comes down to:
However, it makes me think about the question: what if an AI were to be in a similar situation? Done something good and also done something bad. The current vibe seems to be that AI is a "tool" and "guns don't kill people, people kill people". But once you assign agency to AI, it starts opening up unexplored questions I think.
For example, what if you clone an AI state, one goes on to kill, the other goes on to save lives, in what way is the other liable? It's a bit like the entanglement experiment that won the 2022 Nobel physics prize -- you're entangling across space (two forks of a mind) vs time (old "good" version of a celebrity vs new "bad" version of a celebrity) where all versions are equally capable of bad in theory. To what extent are versions of people connected, and their potential?
It also reminds me of the sci-fi story Accelerando by Charles Stross (which I recommend, and you can read online for free here) where different forks of humans can be liable for debts incurred by their forks.
On a related note, I was recently reading a section in Existential Physics by Sabine Hossenfelder titled "Free Will and Morals". Forgive the awful photos, but give it a read:
So it doesn't even have to be AI. If someone is criminally insane, they are no longer agents responsible for their own actions, but rather chaotic systems to be managed, just like you don't "blame" the weather for being bad, or a small child for making mistakes.
Then, what if in a sufficiently advanced society we could simply alter our memories or reprogram criminal intent away? Are we killing the undesirable version? The main reasons for punishment are retribution, incapacitation, deterrence, and rehabilitation, but is there research out there that has really thought about how this applies to AI?
There's a fifth reason that applies only to AI: Roko's Basilisk (warning: infohazard) but it's all connected, as I wonder what majority beliefs we hold today that future cultures will find morally reprehensible. It might be things like consuming animals or the treatment of non-human intelligence that is equivalent to or greater than humans by some metric. At least we can say that racism and domestic violence are pretty obviously bad though.
Great article on some ways to interact with ChatGPT: https://oneusefulthing.substack.com/p/how-to-use-chatgpt-to-boost-your. I find it funny that so many people speak to ChatGPT politely (I do too). I wonder if post-singularity we'll be looked upon more favourably than the impolite humans.
Last weekend I built a small AI product: https://deen.ai. Over the course of the week I've been gathering feedback from friends and family (Muslim and non-Muslim). In the process I learned a bunch and made things that will be quite useful for future projects too. More info here!
Not too long ago I mentioned that search engines will need to add ChatGPT-like functionality in order to stay relevant, that there's already a browser extension that does this for Google, and that Google has declared code red. Right on schedule, yesterday Microsoft announced that they're adding ChatGPT to Bing. (If you're not aware, Microsoft is a 10-figure investor in OpenAI, and OpenAI has granted an exclusive license to Microsoft, but let's not get into how "open" OpenAI is.)
I heard about this via this HackerNews post, and someone in the comments (can't find it now) was saying that this will kill original content as we know it, because traffic won't go to people's websites anymore. After all, why click through to websites, all with different UIs and trackers and ads, when the chatbot can just give you the answers you're looking for, having already scraped all that content? To be honest, if this were the case, I'm not so sure it's such a bad thing. Let me explain!
First of all, have you seen the first page of Google these days? It's all listicles, content marketing, and SEO hacks. I was not surprised to hear that more and more people use TikTok as a search engine. I personally add "site:reddit.com" to my searches when I'm trying to compare products, for example, to get some kind of real human opinion, but even that might not be viable soon. You just can't easily find what you need these days without wading through ads and spam.
Monetising content through ads never really seemed like the correct approach to me (and I'm not just saying that as a consistent user of extensions that block ads and skip sponsored segments in YouTube videos). It reminds me a lot of The Fable of the Dragon-Tyrant. I recommend reading it as it's a useful metaphor, and here's why it reminds me of this (skip the rest of this paragraph if you don't want spoilers): there's a dragon that needs to be fed humans or it will kill everyone. Entire industries spring up around the efficient feeding of the dragon. When humans finally figure out how to kill it, there is huge resistance, as among other things, "[t]he dragon-administration provided many jobs that would be lost if the dragon was slaughtered".
I feel like content creators should not have to rely on ads in the first place in order to be able to create that content. I couldn't tell you what the ideal model is, but I really prefer the Patreon kind of model, which goes back to the ancient world through art patronage. While this doesn't make as much money as ads, I feel like there will come a point where creating content and expressing yourself is so much easier/cheaper/faster than it is today that you won't have high costs to maintain it on average (just look at TikTok).

From the other side, I feel like discovery will become so smooth and accurate that all you need to do is create something genuinely in demand, and it will be discovered on its own, without you trying to employ growth hacks and shouting louder than others. I think this will have the effect that attention will not be such a fiery commodity. People will create art primarily for the sake of art, and not to make money. Companies will create good products, rather than try to market worthless cruft. At least that's my ideal world.
So how does ChatGPT as a search engine affect this? I would say that this should not affect any kinds of social communication. I don't just mean social media, but also a large subset of blogs and similar. I think people will continue to want to follow other people, even the Twitter influencer that posts business tips, rather than ask ChatGPT "give me the top 5 business tips". I believe this for one important reason: search and discovery are two different things. With search, there is intent: I know what I don't know, and I'm trying to find out. With discovery, there isn't: I don't know what I don't know, but I loiter in places where things I would find interesting might appear, and stumble upon them by chance.
Then there's the big question of having a "knowledge engine" skip the sources. Let's ignore the problem of inaccurate information[1] for now. I would say that disseminating knowledge is an unsolved problem at the moment, even through peer-reviewed, scientific journal papers and conference proceedings (this is a whole different topic that I might write about some day, but I don't think it's a controversial view that peer review and scientific publishing are very, very broken).
I do not believe that the inability to trace the source of a certain bit of knowledge is necessarily the problem. I also don't believe that it's necessarily impossible, but let's pretend that it is. It would be very silly, I think, to cite ChatGPT for some fact. I would bet that you could actually get a list of references for any argument you like ("Hey ChatGPT, give me 10 journal citations that climate change is not man-made").
I think the biggest use cases of ChatGPT will be to search for narrowly defined information ("what is the ffmpeg command to scale a video to 16:9?") and to discover information and vocabulary on topics that you know little about, in order to get a broad overview of a certain landscape.
However, I don't see ChatGPT-powered search killing informative articles written by humans. I see AI-generated articles killing articles generated by humans. "Killing" in the sense that they will be very difficult to find. And hey, if ChatGPT could actually do serious research, making novel contributions to the state-of-the-art, while citing prior work, then why shouldn't that work be of equal or greater value to the human equivalent?
In the case of AI-generated garbage drowning out good human articles just by sheer quantity though, what's the solution? I think there are a number of things that would help:
Overall I think that ChatGPT as the default means of finding information is a net positive thing and may kill business models that were flawed from the start, making way for something better.
I've had this problem with normal Google before (the information cards that try to answer your questions). For a long time (even after I reported it), if you searched something like "webrtc connection limit", you would get the wrong answer. Google got this answer from a StackOverflow answer that was a complete guess as far as I could tell. Fortunately, the person who asked the question eventually marked my answer as the correct one (it already had 3x more upvotes than the wrong one) although the new answer never showed up in a Google search card as far as I can tell. ↩︎
I finally wrote an article on my thoughts about ChatGPT after a lot of repeated questions/answers from/to people: https://yousefamar.com/memo/articles/ai/chatgpt/
This is one of those things where I'm not sure it should really be an "article" but instead something more akin to a living document that I update continuously, maybe with a chronological log included. At the same time, a lot of the content is temporally bound and will probably lose relevance quite fast. Something to figure out in the future!
Amarbot was using GPT-J (fine-tuned on my chat history) in order to talk like me. It's not easy to do this if you follow the instructions in the main repo, plus you need a beefy GPU. I managed to do my training in the cloud for quite cheap using Forefront. I had a few issues (some billing-related, some privacy-related) but it seems to be a small startup, and the founder himself helped me resolve these issues on Discord. As far as I could see, this was the cheapest and easiest way out there to train GPT-J models.
Unfortunately, they're shutting down.
As of today, their APIs are still running, but the founder says they're winding down as soon as they send all customers their requested checkpoints (still waiting for mine). This means Amarbot might soon be without AI responses for a while, until I find a different way to run the model.
As for fine-tuning, there no longer seems to be an easy way to do this (unless Forefront open sources their code, which they might, but even then someone has to host it). maybe#6742 on Discord has made a colab notebook that fine-tunes GPT-J in 8-bit and kindly sent it to me.
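For completeness, here's a hedged sketch of the general 8-bit approach, just for loading GPT-J for inference with Hugging Face transformers and bitsandbytes. This isn't that notebook, and fine-tuning proper involves more machinery; you also still need a GPU with roughly 8 GB of memory:

```python
# Sketch: load GPT-J 6B with int8 weight quantisation for inference.
# Assumes `pip install transformers accelerate bitsandbytes` and a CUDA GPU.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-j-6B")
model = AutoModelForCausalLM.from_pretrained(
    "EleutherAI/gpt-j-6B",
    load_in_8bit=True,   # quantise weights to int8 via bitsandbytes
    device_map="auto",   # spread layers across available GPU/CPU memory
)

inputs = tokenizer("Hello, I'm Amarbot and", return_tensors="pt").to(model.device)
print(tokenizer.decode(model.generate(**inputs, max_new_tokens=40)[0]))
```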
I've always thought that serverless GPUs would be the holy grail of the whole microservices paradigm, and we might be close; hopefully that will make fine-tuning easy and accessible again.
/u/dismantlemars created a colab to run OpenAI's new Point-E model that you can use here. My first few experiments were interesting, though not very usable yet! Supposedly it's thousands of times faster than DreamFusion (the best-known attempt at this), though. It took me about 30 secs to generate models, and converting the point cloud to a mesh was instant.
I tried to first turn my profile picture into 3D, which came out all Cronenberg'd. To be fair, the example images are all really clean renderings of 3D models, rather than a headshot of a human.
Then I tried the text prompt "a pink unicorn" which came out as an uninteresting pink blob vaguely in the shape of a rocking horse. Simply "unicorn" looked a bit more like a little dinosaur.
And finally, "horse" looked like a goat-like horse in the end.
The repo does say that the text to point cloud model, compared to the image to point cloud model is "small, worse quality [...]. This model's capabilities are limited, but it does understand some simple categories and colors."
I still find it very exciting that this is even possible in the first place. Probably less than a year ago, I spoke to the anything.world team, and truly AI-generated models seemed so far out of reach. Now I feel like it won't be much longer before we can populate entire virtual worlds just by speaking!
On a related note, I recommend that you join the Luma waitlist for an API over DreamFusion.
There are APIs out there for translating natural language to actions that a machine can take. An example from wit.ai is the IoT thermostat use case.
But why not instead use GPT-3? It ought to be quite good at this. And as I suspected, the results were quite good! The green highlighted text is AI-generated (so were the closing braces, but for some reason it didn't highlight those).
I think there's a lot here that can be expanded! E.g. you could define a schema beforehand rather than just give it some examples like I have, but I quite like this test-driven approach of defining what I actually want.
I made some tweaks to teach it that I want it to put words in my mouth, as it were. It invented a new intent that I hadn't defined, so it would probably be useful to define an array of valid intents at the top. It did, however, manage to sweet-talk my "wife"!
I think this could work quite well in conjunction with other "modules", e.g. a prompt that takes a recipient and a list of people I know (and what their relationship is to me), and outputs a phone number, for example.
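Here's a hedged sketch of what that few-shot, test-driven approach might look like in code, assuming the OpenAI Python client; the intents and example utterances are hypothetical:

```python
# Sketch: few-shot prompt that converts a message into a JSON intent.
# Pinning down valid intents in the prompt helps stop it inventing new ones.
import json
import openai

PROMPT = """Convert the message into a JSON intent.
Valid intents: set_temperature, set_lights, send_message.

Message: Set the living room to 21 degrees
Intent: {"intent": "set_temperature", "room": "living room", "value": 21}

Message: Turn off the hallway lights
Intent: {"intent": "set_lights", "room": "hallway", "state": "off"}

Message: %s
Intent:"""

def parse_intent(message: str) -> dict:
    completion = openai.Completion.create(
        model="text-davinci-003",
        prompt=PROMPT % message,
        max_tokens=100,
        temperature=0,  # deterministic output for structured data
        stop=["\n"],    # stop after the single JSON line
    )
    return json.loads(completion["choices"][0]["text"])

print(parse_intent("Make the bedroom a bit warmer"))
```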
Amazon's creating AI-generated animated bedtime stories (story arc, images, and accompanying music) with customisable setting, tone, characters, and score. I believe that procedurally generated virtual worlds will be one of the prime use cases for these large models, and this is one example that I expect to see more of!
I think the most difficult part here will be to craft truly compelling and engaging stories, though this is probably soon to be solved. My brother and I attempted a similar project (AI-generated children's books) and the quality overall was not good enough at the time, but at the speed these things move I expect that to be a thing of the past in a matter of months!
Seems like GPT-4 is just around the corner! I'm really looking forward to it, and not just for the improvement over GPT-3, but for the multi-modal inputs. I really think GPT-4 and models like it will be central to our future.
Nvidia's new diffusion model is really pushing the envelope. A lot of exciting capabilities!
I'm certain that the market for GPT-3-based spreadsheet plugins/add-ons is far riper than that for libraries targeting developers, like cerebrate.ai. I've seen a general-purpose add-on for Google Sheets here, but I think that crafting prompts to do specific things, and wrapping them in higher-level functions, has much more potential.
More Stable Diffusion resource links: https://rentry.org/sdupdates2