2025
2019
I played with a similar idea during my research fellowship at A*Star Singapore, where I was trying to make synthetic engine data.
It was generated using an MNIST dataset with GANs (Generative Adversarial Networks) and cGANs (conditional GANs).
Disclaimer: I was in my sophomore year of college and had no idea what I was doing.
5 years later I tried doing it again.
But with LLMs.
Each font has a bunch of letter shapes called glyphs.
Glyphs look like this.
A glyph is made up of points, and directions on how to connect these points.
Curved letters like O’s have curves between points
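To make "curves between points" concrete: TrueType outlines use quadratic Bezier curves, where each segment has a start point, one off-curve control point, and an end point. Here is a minimal sketch in Python of how such a segment turns into drawable points. The function names and the sample coordinates are made up for illustration, they are not from any font.

```python
def quadratic_bezier(p0, p1, p2, t):
    """Point at parameter t (0..1) on the curve from p0 to p2, pulled toward control point p1."""
    u = 1.0 - t
    x = u * u * p0[0] + 2 * u * t * p1[0] + t * t * p2[0]
    y = u * u * p0[1] + 2 * u * t * p1[1] + t * t * p2[1]
    return (x, y)


def flatten(p0, p1, p2, steps=8):
    """Approximate the curve with steps straight pieces (steps + 1 points)."""
    return [quadratic_bezier(p0, p1, p2, i / steps) for i in range(steps + 1)]


# Hypothetical segment: start, off-curve control, end
polyline = flatten((0, 0), (50, 100), (100, 0))
```

The control point is never drawn itself. It only bends the line between the two on-curve points, which is why nudging a single number can change the shape in ways that are hard to predict from the table alone.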
This is represented in a glyph table, like this one for the letter X.
qCurveTo
→ Draw QUADRATIC BEZIER CURVE
Control 0: (310, 551)
Point 36: (310, 551)
Control 1: (366, 350)
Point 37: (366, 350)
Control 2: (471, 213)
Point 38: (471, 213)
Control 3: (616, 143)
Point 39: (616, 143)
End Point: (706, 143)
Point 40: (706, 143)
I tried the simplest thing.
I told an LLM to look at the glyph table and asked it to manipulate the values.
Prompt : Make this fella more Italic
It did. Kinda
Here I realized that LLMs are not very good at changing values that render to a visual form.
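For comparison, a plain slant is a deterministic geometric operation: a horizontal shear that shifts every point sideways in proportion to its height. The sketch below is not what the LLM did, and a real italic is a redesign rather than just a slant, but it shows the kind of coordinated change the model would have had to make to every number at once. The function name and the 12 degree default are my assumptions. The points are the ones from the table above.

```python
import math


def slant(points, angle_degrees=12.0):
    """Shear points to the right: x' = x + y * tan(angle), y stays the same."""
    k = math.tan(math.radians(angle_degrees))
    return [(round(x + k * y), y) for x, y in points]


# Points 36 to 40 from the glyph table
points = [(310, 551), (366, 350), (471, 213), (616, 143), (706, 143)]
slanted = slant(points)
```

Points on the baseline (y = 0) do not move at all, and the higher a point sits, the further it shifts.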
This reminded me of something similar: Manim + LLMs.
Chris, my friend, made another tool (link) to make programmatic videos using Manim.
It was good but not great. It kept overlaying multiple items on top of each other.
He messaged Grant Sanderson, the creator of Manim, and this is what he said:
The value of programmatic animations is that you can do it entirely in text, so no multimodality or fancy diffusion models for video creation are necessary. But perhaps that's also the limit, when constrained like this, it's a blind man painting at a canvas. I don't know what my advice would be. For what it's worth, while I find LLMs remarkable and helpful in general when I'm doing my actual work, I rarely find them helpful in creating animations.
This felt kind of the same: a blind man manipulating vectors who couldn't see what he had drawn before.
It's weird, I know… I think.
This made me move to a new technology: diffusion models.
The successor to my beloved GANs.
And things immediately changed. It did it on the first try.
Here is what I got when I asked it to make an O with a white background.
Now that I have this, how do I convert it to an SVG?
Then it was smooth sailing. I got the SVG out in no time.
Then I asked nano banana to create 26 letters and convert them to a TTF.
TTF (TrueType Font) is a file format that maps glyphs to characters that the system can understand. It also involves some normalization.
Here is what it looked like. I know the letters are reversed. But that was not the main problem.
Since the letters are not consistently at the same height, that makes for a terrible font (unless you are going for something like that). The next goal was to generate multiple letters at once, following some constraints.
I didn't do all of them at once because of the loss in quality that might cause.
I gave it guidelines such as the ascender, descender, x-height, and baseline.
Good article to understand all this nonsense
https://pangrampangram.com/blogs/journal/anatomy-of-the-letterform
I made this grid using Python and asked the model to fit the letters accordingly.
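A grid like this can be sketched with Pillow in a few lines. This is a rough reconstruction rather than the exact script: the cell size, the colors, and the guideline positions (given as fractions of the cell height, measured from the top) are all assumptions.

```python
# Hypothetical guideline positions as fractions of the cell height, from the top
GUIDES = {"ascender": 0.10, "x_height": 0.45, "baseline": 0.75, "descender": 0.95}


def guide_rows(cell_top, cell_height, guides=GUIDES):
    """Pixel row of each guideline for a cell starting at cell_top."""
    return {name: cell_top + round(frac * cell_height) for name, frac in guides.items()}


def draw_grid(cols, rows, cell=200, guides=GUIDES):
    """Return a white image with guidelines in every row of cells and vertical cell separators."""
    # Pillow is imported here so guide_rows() can be used without it installed
    from PIL import Image, ImageDraw

    width, height = cols * cell, rows * cell
    img = Image.new("RGB", (width, height), "white")
    draw = ImageDraw.Draw(img)
    for r in range(rows):
        for y in guide_rows(r * cell, cell, guides).values():
            draw.line([(0, y), (width - 1, y)], fill=(200, 0, 0), width=1)
    for c in range(1, cols):
        draw.line([(c * cell, 0), (c * cell, height - 1)], fill=(180, 180, 180), width=1)
    return img


# Example usage: draw_grid(6, 2).save("grid.png")
```

Keeping the guideline positions in one place matters later, because the same numbers are needed again when the letters are cut out and scaled.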
It did an okay job. The "a" needed to sit between the baseline and the x-height, and I gave it clear instructions in the prompt.
Maybe it's because my x-height is really small, and with its training data nano banana couldn't convince itself to make an "a" that small compared to an H.
I tried once more
Slight overshoot hahaha
Then I thought, why not meet the LLM in between? Let's give it a collection of traced letters and have it use that as a reference. Worked like a charm.
Then we had to extract these letters out. Shoutout Pillow again.
Added markers and used Python to cut them.
Got 60 of these little fellas.
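The cutting step can be sketched like this, assuming the letters sit in a uniform grid of equally sized cells. My actual markers are not reproduced here, so the grid layout, the margin, and the function names are stand-ins. Pillow's `Image.crop` takes a `(left, upper, right, lower)` box.

```python
def cell_boxes(cols, rows, cell_w, cell_h, margin=0):
    """Crop boxes (left, upper, right, lower) for every cell, row by row."""
    boxes = []
    for r in range(rows):
        for c in range(cols):
            left = c * cell_w + margin
            upper = r * cell_h + margin
            right = left + cell_w - 2 * margin
            lower = upper + cell_h - 2 * margin
            boxes.append((left, upper, right, lower))
    return boxes


def cut_letters(sheet, boxes):
    """Crop each box out of sheet, which is expected to be a PIL.Image.Image."""
    return [sheet.crop(box) for box in boxes]


# Example usage, with a hypothetical file name:
# from PIL import Image
# sheet = Image.open("letters.png")
# letters = cut_letters(sheet, cell_boxes(10, 6, 200, 200, margin=4))
```

A small margin trims the grid lines off each cutout so they do not end up traced into the glyph.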
Now we just convert them to SVGs and then create a TTF, right?
But to be converted to a TTF or an OTF file (the file that is given to the OS to use that font), the glyphs need to be normalized.
Should work, right? NOOOOO
The normalization needs to be managed.
Look at the glyphs below. The base is nearly the same, which is why the mismatch sticks out like a sore thumb.
What next? It needs to be normalized, and I need to spend more time on it.
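One way to think about the normalization, as a sketch and not something I have working: if every letter was drawn against the same guideline rows, the pixel coordinates can be mapped into font units so that the baseline lands on y = 0 and the x-height lands on a fixed value. Image coordinates grow downward while font coordinates grow upward, so the y-axis has to be flipped too. The function name, the parameter names, and the 500-unit x-height (on a 1000 units-per-em grid, a common choice) are assumptions.

```python
def to_font_units(points, baseline_px, x_height_px, x_height_units=500, left_px=0):
    """Map pixel points (y down) to font units (y up) with the baseline at y = 0.

    baseline_px and x_height_px are the pixel rows of those two guidelines in the cell.
    """
    scale = x_height_units / (baseline_px - x_height_px)
    return [
        (round((x - left_px) * scale), round((baseline_px - y) * scale))
        for x, y in points
    ]


# Hypothetical cell: baseline at row 150, x-height guideline at row 90
outline = to_font_units([(0, 150), (60, 90), (30, 190)], baseline_px=150, x_height_px=90)
```

Because the scale comes from the guidelines and not from each letter's own bounding box, a short "a" and a tall "H" keep their relative sizes. Anything below the baseline, like the tail of a "g", ends up with a negative y.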
This project was part of my weekly sprints. I feel like this can be given some more time.
Next
Look at this n. It was made by nano banana and I kinda think it is beautiful.
If everyone had the power to make their own fonts, that would be kinda great. I want to live in that world. The font is a part of every cool book or blog you read.
The typographer must analyze and reveal the inner order of the text, as a musician must reveal the inner order of the music he performs. — The Elements of Typographic Style, Robert Bringhurst
Imagine each Substack owner can make their own font to highlight the essence of their writing.
I also found out that my friend's company got charged $2,000 per character. WTF.
Here is the font.