Building AI Models with Python and Keras

Published: 22 May 2024
on the channel: Stephen Blum

If you're trying to understand how artificial intelligence and machine learning function through some hands-on Python coding, here's a simple example I like to use when building AI models. I like to keep the method straightforward, especially for beginners. Let's assume we've already got our training data prepared and vectorized.

Our input dataset consists of sentences, and our targets are clearly defined. We have our X and Y variables, which are our features and labels. Once everything is set, we can start defining the model using a typical architecture, the sequential model.
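As a minimal sketch of what "prepared and vectorized" might look like, here is some random stand-in data with the shapes the rest of the walkthrough assumes: 72-dimensional sentence vectors as features and binary labels. The sample count and the binary task are illustrative assumptions, not from the original.

```python
import numpy as np

# Stand-in for vectorized training data (assumed shapes for illustration):
# each sentence becomes a 72-dimensional float vector, each label is 0 or 1.
rng = np.random.default_rng(0)
n_samples, n_features = 100, 72

X = rng.normal(size=(n_samples, n_features)).astype("float32")  # features
y = rng.integers(0, 2, size=(n_samples,)).astype("float32")     # labels
```

In practice X would come from a text vectorizer rather than a random generator, but the shapes are what matter for the model definition that follows.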

This model uses Keras, a high-level API that sits atop TensorFlow. You might also consider PyTorch's Sequential container, which similarly lets you stack multiple layers in a model. We then face several hyperparameter choices here.

Depending on the size of the model, we can boost its performance by supplying it with a sufficiently large number of units. Our input dimension, in this case, is 72, and the input data must be shaped to match.

A dense layer is just a regular matrix that can contain any number of units, stored as floating-point values. We have three dense layers here, all of different sizes. The size of the first one is determined by the input dimension and the number of hidden units.

In our example, the first layer has 720 units, so its weight matrix holds 72 by 720 parameters for the 72-dimensional input. In contrast, the second layer will have 720 by 720 parameters.
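The parameter arithmetic above can be checked with a one-line helper. This is a sketch of the standard count for a fully connected layer (weights plus one bias per unit); the helper name is my own, not from the original.

```python
def dense_params(input_dim, units):
    # A Dense layer holds an input_dim x units weight matrix
    # plus one bias term per unit.
    return input_dim * units + units

first = dense_params(72, 720)    # 72-dim input into 720 hidden units
second = dense_params(720, 720)  # 720 units feeding 720 units

print(first)   # 52560
print(second)  # 519120
```

The jump from roughly 52,000 to over 500,000 parameters between the first and second layers is why layer widths dominate a model's size so quickly.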

While the number of parameters may seem enormous, we cannot justify a model of this scale given the limited amount of training data we have on hand. As we feed in more and more data, we need more parameters to develop a deeper understanding of the input. The input matrix then produces an output that is multiplied by the next layer's matrix, and so on.

The trick here is to prevent the model from failing by keeping the numbers under control. To that end, we use an activation function that squashes the values into a bounded range. The last bit of our model contains the optimization details.

We are using the standard stochastic gradient descent optimizer. This optimizer shuffles all our input data and feeds it to the neural network. Based on the loss function, it makes iterative adjustments to all the weight matrices according to our learning rate.
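Putting the pieces together, here is a minimal Keras sketch of the model described above: a 72-dimensional input, two 720-unit dense layers with a bounded activation, an assumed single-unit sigmoid output for binary labels, and SGD with a learning rate. The output layer, activations, loss, and learning rate are illustrative assumptions, not specified in the original.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    keras.Input(shape=(72,)),               # 72-dimensional input vectors
    layers.Dense(720, activation="tanh"),   # 72 x 720 weights + 720 biases
    layers.Dense(720, activation="tanh"),   # 720 x 720 weights + 720 biases
    layers.Dense(1, activation="sigmoid"),  # assumed binary output, squashed to (0, 1)
])

# Standard gradient descent: shuffle the data each epoch and nudge the
# weight matrices based on the loss, scaled by the learning rate.
model.compile(
    optimizer=keras.optimizers.SGD(learning_rate=0.01),
    loss="binary_crossentropy",
    metrics=["accuracy"],
)

# Tiny random stand-in data, just to show the training call.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 72)).astype("float32")
y = rng.integers(0, 2, size=(100,)).astype("float32")
model.fit(X, y, epochs=1, batch_size=8, shuffle=True, verbose=0)
```

The `shuffle=True` argument to `fit` is what reorders the input each epoch before it's fed through the network.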

This completes our model. You can always make it more complex by adding more stuff.