| 1 | +# AI Dev |
| 2 | + |
| 3 | +I started my dive into AI in 2008 writing a Boid / Crowd system for my thesis while in art college, School of Visual Arts. |
| 4 | + |
| 5 | + It was an insane particle script + 3d animation cycles in Maya haha. |
| 6 | + |
| 7 | +Then I did Boid movement, navigation, & obstacle detection in animated films for 5 years at Blue Sky Studios, using Houdini. |
| 8 | + |
| 9 | +I dove into Style-Transfer AI & Long Short-Term Memory (LSTM) training in 2019-2020, |
| 10 | + |
| 11 | + Like making a Node.js server (website) understand my voice & auto Google search for me. |
| 12 | + |
| 13 | +Since then, I've been developing different multi-media AI structures in my spare time. |
| 14 | + |
| 15 | +In 2015 I decided I'd cram a machine learning AI into a single-board computer, a Jetson TK1, by the end of 2026. |
| 16 | + |
| 17 | + Something that could write down what I say, |
| 18 | + |
| 19 | + Use vision to understand that an object simply went out of frame, |
| 20 | + |
| 21 | + Yet "knows" that if it looks over, the object is still there; |
| 22 | + |
| 23 | + 'Long Term Attention' |
| 24 | + |
| 25 | +At the end of 2023, this evolved into a deep learning AI crammed into, likely, a Jetson Nano. |
| 26 | + |
| 27 | + As something to infer what I mean, from what I say, |
| 28 | + |
| 29 | + Or give a "thought" on what it saw or heard in the world around it. |
| 30 | + |
| 31 | + 'Machine Learning' is AI that can learn basic patterns. |
| 32 | + |
| 33 | +'Deep Learning' is Machine Learning, |
| 34 | + |
| 35 | +But uses neural networks to form patterns of patterns. |
| 36 | + |
| 37 | +Realistically, I'd just be happy to make something that can understand what I say and can give a semi-coherent response without an internet connection. |
| 38 | + |
| 39 | +I've yet to begin on the core of the AI, as I'm still testing different structures' ability to adapt to stimuli. |
| 40 | + |
| 41 | + You could guess, |
| 42 | + |
| 43 | +All the recent AI stuff has been quite serendipitous for my creation! |
| 44 | + |
| 45 | +For my 2026 goal, I've been exploring Graph Attention Network (GAT) artificial intelligence. |
| 46 | + As GATs allow me to treat 'concepts' as 'objects', rather than sections of words/pixels as a tensor or 'piece of a concept'. |
| 47 | + |
| 48 | + GATs are a type of neural network that considers the relationships between data points. |
| 49 | + |
| 50 | +As a type of Graph Neural Network (GNN), |
| 51 | + |
| 52 | +It's best for predicting connections between ideas / things / data in a system. |
| 53 | + |
| 54 | + GNNs are commonly used for "Recommendation Systems", |
| 55 | + |
| 56 | + Hey, you might know Jim Bob McGee!! |
| 57 | + |
| 58 | + But GATs could be used for so much more! |
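To make the "relationships between data points" idea concrete, here's a minimal sketch of one single-head graph attention pass in plain Python. It follows the standard GAT recipe (project each node, score each edge, softmax the scores over the neighborhood, then blend the neighbors), not the code from this project. The concept names, the identity matrix `W`, and the attention vector `a` are made-up values for illustration only.

```python
import math

def leaky_relu(x, slope=0.2):
    return x if x > 0 else slope * x

def matvec(W, v):
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def gat_layer(features, neighbors, W, a):
    """One single-head graph attention pass.

    features:  {node: feature vector}
    neighbors: {node: nodes it attends to (include the node itself)}
    W:         shared weight matrix, out_dim rows by in_dim columns
    a:         attention vector of length 2 * out_dim
    """
    projected = {n: matvec(W, h) for n, h in features.items()}
    out, attention = {}, {}
    for i, nbrs in neighbors.items():
        # Raw score per edge i -> j: LeakyReLU(a . [W h_i || W h_j])
        scores = [
            leaky_relu(sum(x * y for x, y in zip(a, projected[i] + projected[j])))
            for j in nbrs
        ]
        # Softmax over the neighborhood so the weights sum to 1
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        alphas = [e / total for e in exps]
        attention[i] = dict(zip(nbrs, alphas))
        # New feature for node i: attention-weighted blend of its neighbors
        dim = len(projected[i])
        out[i] = [
            sum(al * projected[j][d] for al, j in zip(alphas, nbrs))
            for d in range(dim)
        ]
    return out, attention

# Hypothetical toy graph: three 'concepts' as nodes (values are made up)
features = {"cup": [1.0, 0.0], "table": [0.9, 0.1], "cat": [0.0, 1.0]}
neighbors = {
    "cup": ["cup", "table", "cat"],
    "table": ["table", "cup"],
    "cat": ["cat", "cup"],
}
W = [[1.0, 0.0], [0.0, 1.0]]   # identity projection, for readability
a = [1.0, 0.0, 1.0, 0.0]       # scores only the first feature of each node

out, attention = gat_layer(features, neighbors, W, a)
print(attention["cup"])
```

With these toy numbers, "cup" pays more attention to "table" than to "cat", since the two share more of the feature that `a` scores. In a trained GAT, `W` and `a` would be learned rather than hand-picked.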
| 59 | + |
| 60 | +I've been working on a general-purpose neuron that adjusts its own connections during prediction; |
| 61 | + |
| 62 | + So the same system could learn my voice on the fly, as well as sensor signals connected to the Jetson computer. |
| 63 | + |
| 64 | +Since it's the structure in a GAT that causes regions of neural activation based on stimuli, |
| 65 | + |
| 66 | + It forms a result (prediction) after subsequent activations, as though compounding ripples in a pond. |
| 67 | + |
| 68 | +Rather than a field of numbers aligning to yield a prediction, |
| 69 | + |
| 70 | + It's the structure of neural connections which manipulates the data. |
| 71 | + |
| 72 | +I've been going in a direction that should yield a similar result to a Recurrent Neural Network (RNN), but with a different mental structure. |
| 73 | + |
| 74 | + With that general-purpose neuron, I can provide text, images, audio histograms, etc. to the network. |
| 75 | + |
| 76 | +RNNs can be used for/in nearly any AI, |
| 77 | + |
| 78 | +Best for detecting patterns in sequential data, |
| 79 | + |
| 80 | +Like time-based events or words in text. |
| 81 | + |
| 82 | + They are the basis for many types of AI, like LSTMs; |
| 83 | + |
| 84 | +And were used in language models before Transformers, which power LLMs like ChatGPT. |
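For a feel of why RNNs suit sequential data, here's a minimal sketch of a plain (Elman-style) recurrent step in Python. The hidden state carries forward between inputs, so the same inputs in a different order leave the network in a different state. The weight values and the names `rnn_step` and `run` are arbitrary choices for this illustration.

```python
import math

def rnn_step(x, h_prev, W_in, W_rec):
    """One recurrent update: h_t = tanh(W_in x_t + W_rec h_{t-1})."""
    return [
        math.tanh(
            sum(w * xi for w, xi in zip(W_in[k], x))
            + sum(w * hi for w, hi in zip(W_rec[k], h_prev))
        )
        for k in range(len(h_prev))
    ]

def run(sequence, W_in, W_rec, hidden_size):
    """Feed a whole sequence through, returning the final hidden state."""
    h = [0.0] * hidden_size
    for x in sequence:
        h = rnn_step(x, h, W_in, W_rec)
    return h

# Arbitrary fixed weights, just for the demo
W_in = [[0.5, -0.3], [0.2, 0.8]]
W_rec = [[0.1, 0.4], [-0.6, 0.2]]

forward = run([[1.0, 0.0], [0.0, 1.0]], W_in, W_rec, 2)
backward = run([[0.0, 1.0], [1.0, 0.0]], W_in, W_rec, 2)
print(forward, backward)  # same two inputs, different order, different state
```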
| 85 | + |
| 86 | + The GAT will create connections from initial random data points, sample the differences, then pass the 'prediction' forward and 'back' in the chain, and adjust the connections based on their revisit to the same data in the current 'prediction'. |
| 87 | + |
| 88 | + Relying on localized regions of sub-networks to recurrently process the data. |
| 89 | + |
| 90 | +It should be self-taught discrimination of attention between neurons; |
| 91 | + |
| 92 | + Like in the human brain. |
| 93 | + |
| 94 | + (When the purple circles go red in the GAT video, first vid) |
| 95 | + |
| 96 | +How about an Echo State Network (ESN) AI I wrote in the spring-summer of 2024? |
| 97 | + |
| 98 | +An ESN is a type of RNN, |
| 99 | + |
| 100 | +Which considers time in its prediction. |
| 101 | + |
| 102 | +It thinks about past events to predict future events. |
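As a rough picture of a textbook ESN (not the modified one described on this page), here's a minimal sketch in plain Python. The reservoir weights are random and stay frozen; only the linear readout learns, here with a simple online least-mean-squares update so it can adapt on the fly. The class name `TinyESN`, the reservoir size, leak rate, learning rate, and the sine-wave task are all assumptions picked for the demo.

```python
import math
import random

class TinyESN:
    def __init__(self, n_inputs, n_reservoir=20, leak=0.3, seed=0):
        rng = random.Random(seed)
        self.leak = leak
        self.W_in = [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs)]
                     for _ in range(n_reservoir)]
        W = [[rng.uniform(-1.0, 1.0) for _ in range(n_reservoir)]
             for _ in range(n_reservoir)]
        # Scale so the largest absolute row sum is 0.9. The spectral radius
        # can't exceed that, so old inputs fade out instead of blowing up.
        biggest = max(sum(abs(w) for w in row) for row in W)
        self.W = [[0.9 * w / biggest for w in row] for row in W]
        self.state = [0.0] * n_reservoir
        self.W_out = [0.0] * (n_reservoir + 1)  # +1 for a bias term

    def step(self, x):
        """Update the reservoir (leaky tanh units); the weights never change."""
        new_state = []
        for k in range(len(self.state)):
            pre = (sum(w * xi for w, xi in zip(self.W_in[k], x))
                   + sum(w * s for w, s in zip(self.W[k], self.state)))
            new_state.append((1 - self.leak) * self.state[k]
                             + self.leak * math.tanh(pre))
        self.state = new_state
        return new_state

    def predict(self):
        return sum(w * s for w, s in zip(self.W_out, self.state + [1.0]))

    def learn(self, target, lr=0.05):
        """Online LMS update of the readout only; returns the error before it."""
        features = self.state + [1.0]
        error = target - self.predict()
        self.W_out = [w + lr * error * f for w, f in zip(self.W_out, features)]
        return error

# Demo task (an assumption): predict the next sample of a sine wave
signal = [math.sin(0.2 * t) for t in range(300)]
esn = TinyESN(n_inputs=1)
errors = []
for t in range(len(signal) - 1):
    esn.step([signal[t]])
    errors.append(abs(esn.learn(signal[t + 1])))

print("mean error, first 50 steps:", sum(errors[:50]) / 50)
print("mean error, last 50 steps: ", sum(errors[-50:]) / 50)
```

Batch ESNs usually fit the readout in one shot with ridge regression instead; the online update here is just the simplest way to show learning as the data streams in.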
| 103 | + |
| 104 | +Since an ESN brain can learn on the fly, why not feed it some videos I made? |
| 105 | + |
| 106 | + Currently I'm not using my ESN's predicted movement for anything in Python, |
| 107 | + |
| 108 | + The next step would be introducing a base image to motion-transfer / reference. |
| 109 | + |
| 110 | +However, I did build a simple version in Unity to learn player combos + movement over time. |
| 111 | + |
| 112 | + So I'm mostly just learnin' while watching my AI learnin'! |
| 113 | + |
| 114 | +In the videos, I had the "reservoir" set to 15 time steps; you'll notice the brain shifts about every 15 frames. |
| 115 | + |
| 116 | +By frame ~45, it's learned some patterns in the X video. |
| 117 | + |
| 118 | +The brain seems to completely melt at ~75 & rebuild itself by ~95. |
| 119 | + |
| 120 | +It should be happenstance that the brain shifts when the reservoir fills; |
| 121 | + |
| 122 | +The brain should shift, but the 15-frame fill might be a bug in my logic, |
| 123 | + |
| 124 | + Or maybe it's just a coincidence ::shrugs:: |
| 125 | + |
| 126 | +But it's detecting patterns in motion! |
| 127 | + |
| 128 | +If you couldn't tell, I'm training my AIs on my own works. |
| 129 | + |
| 130 | +A personally made AI trained on personally made images / videos / photos / code / writing. |
| 131 | + |
| 132 | + That means I can copyright my generations, right? |
| 133 | + |
| 134 | + If I made every aspect of the AI & training data? |
| 135 | + |
| 136 | + - February 2025 |
| 137 | + |
| 138 | +I've begun on the core of the AI, as of May 24th, 2025. |
| 139 | + |
| 140 | + I have the beginnings of a 'Micro-Term' memory implemented to act as gated attention during inference. |
| 141 | + |
| 142 | +This, paired with automatic graph edge splitting ('Dynamic' in DGNN or DGAT) and use of geometric clustering, seems to be giving me values of a "remembered" object when it's outside of the dataset. |
| 143 | + |
| 144 | + Bodily awareness of limbs, objects outside of the field of view, and other 'long term' tensors/classifications at a temporary scale. |
| 145 | + |
| 146 | +It's a 4d kernel, in that it uses an ESN to train on its own mistakes, |
| 147 | + |
| 148 | + Basing its decisions on prior back-propagation states/adjustments. |
| 149 | + |
| 150 | + The beginnings of a meta-learning process, hehe. |
| 151 | + |
| 152 | +I'm using a method I'm calling 'Stack Crunching', |
| 153 | + |
| 154 | + Where I aggregate the time-dependent weights into a "checkpoint" of sorts. |
| 155 | + |
| 156 | + This allows the ESN to have a 'baseline' understanding of data that I can parse into with vectors calculated from tensor weights found within a quantized version of the input data. |
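The details of 'Stack Crunching' aren't spelled out here, so the following is only a guess at the general shape of "aggregating time-dependent weights into a checkpoint": collapse a stack of weight snapshots (oldest first) into one vector, counting recent snapshots more heavily. The function name `crunch_stack`, the exponential `decay` weighting, and the toy numbers are all assumptions, not the method used in this project.

```python
def crunch_stack(snapshots, decay=0.8):
    """Collapse weight snapshots (oldest first) into a single 'checkpoint'.

    Hypothetical illustration: an exponentially weighted average, where a
    snapshot that is k steps old counts decay**k as much as the newest one.
    """
    n = len(snapshots)
    weights = [decay ** (n - 1 - t) for t in range(n)]
    total = sum(weights)
    dim = len(snapshots[0])
    return [
        sum(w * snap[d] for w, snap in zip(weights, snapshots)) / total
        for d in range(dim)
    ]

# Toy stack: the first weight drifts upward over time, the second is stable
stack = [[0.0, 1.0], [0.5, 1.0], [1.0, 1.0]]
checkpoint = crunch_stack(stack)
print(checkpoint)  # first value leans toward the recent 1.0, second stays at 1.0
```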
| 157 | + |
| 158 | +You can assume that the 'ESN' is not a standard 'Echo State Network' anymore. |
| 159 | + - May 2025 |