AI is a fast-growing and very promising field, but it is difficult for a beginner to learn. I hope my website will help you master the essential skills in programming and AI.

I would like to share some articles on software and AI that I wrote. They reflect my own experience in learning and digesting these subjects, and my current views on them. Be aware that they carry my personal preferences; other people may hold more thoughtful views. I hope you find them useful. If you would like to discuss anything with me, please email me at the address listed in About.


THIS SITE IS UNDER CONSTRUCTION WITH SUBSTANTIAL MATERIALS TO BE ADDED SOON.


Contents


Which Programming Languages to Learn?

It could be a heated and opinionated question, so let us base our view on facts. Here is the widely followed programming language ranking -- the TIOBE Programming Community Index:



I would like to pick the top four languages -- Python, C, C++, and Java -- as a starting point for honing our skills.

Python is easy to start with, while C++ is notably harder. So what is the proper order in which to learn these languages?

I first learned Python from my great computer science teacher, Mr. Forhan. I learned tons from him! Later I diligently studied some intermediate and advanced Python programming. Then I learned C++, also from Mr. Forhan. I studied C cover to cover from its creators' book -- the classic K&R, The C Programming Language. I learned to apply OOP to various projects, including game development, and picked up some modern software tenets like "favor composition over inheritance", Test-Driven Development, and the use of assertions in professional code.

I learned Java last, through self-study, and spent less time on it than on Python and C++, since the conceptual similarities among these languages saved time. I achieved a 5 out of 5 on the AP CSA exam (Java).

Others may prefer a different order, but if I had to start again, I would still follow the same one.


How I Built This Website

Scalable Web App

There are many website-building tools geared towards non-programmers, like WordPress and Wix, but they usually serve a single purpose and are not scalable.

My goal is two-fold: 1. provide rich and useful content; 2. acquire the capability to build dynamic and scalable web applications, which, if successful, can easily be scaled (with zero or negligible changes) to run on thousands of servers, as in eBay-like stores, LinkedIn-like social networks, or big data analytics.

My thoughts on scalable applications are deeply influenced by the paper The Twelve-Factor App.

Especially these tenets:

V. Build, release, run
Strictly separate build and run stages
VI. Processes
Execute the app as one or more stateless processes
VII. Port binding
Export services via port binding
VIII. Concurrency
Scale out via the process model
IX. Disposability
Maximize robustness with fast startup and graceful shutdown
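
To make a couple of these tenets concrete, here is a minimal Python sketch of my own (not code from the paper, and not this site's actual code) that binds its port from the environment (VII) and shuts down gracefully on SIGTERM (IX), while keeping no per-request state (VI):

import os
import signal
import sys
from http.server import BaseHTTPRequestHandler, HTTPServer

class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        # stateless: every request is served from scratch,
        # with no session state stored in the process
        self.send_response(200)
        self.end_headers()
        self.wfile.write(b"hello\n")

def shutdown(signum, frame):
    # disposability: exit promptly and cleanly when the platform says stop
    sys.exit(0)

signal.signal(signal.SIGTERM, shutdown)

port = int(os.environ.get("PORT", "8000"))  # port binding via the environment
HTTPServer(("", port), Handler).serve_forever()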

Flask Web Application Overview

Flask is a powerful web framework and is very popular among Python developers. At a high level, it routes each incoming HTTP request to a Python function based on the request's URL, and turns the function's return value into an HTTP response.
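
For a flavor of how this looks in code, here is a minimal hello-world sketch (a generic example, not this site's actual code):

from flask import Flask

app = Flask(__name__)

@app.route("/")
def index():
    # Flask routes the incoming HTTP request for "/" to this function
    # and turns its return value into an HTTP response
    return "Hello, world!"

if __name__ == "__main__":
    app.run(port=5000)  # development server only; use Gunicorn in production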

Twitter Bootstrap for a Better Look

I use HTML and CSS for basic web pages, and JavaScript for modifying and updating pages dynamically (as in my WebSocket multi-player Mini-Jeopardy game). To achieve a consistent and responsive look on PCs, tablets, and mobile phones, I use Twitter Bootstrap.

Scaling the App with Gunicorn

I learned to use Gunicorn to spawn multiple instances of the application in order to scale the web app. So, technically, if high traffic hits, I just need to purchase more computing power, without software changes.
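
For example, assuming the Flask application object is named app inside a module app.py (hypothetical names), a command along these lines starts four worker processes behind a single address:

gunicorn --workers 4 --bind 127.0.0.1:8000 app:app

Adding workers (or machines behind a load balancer) scales capacity without touching the application code.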



Please note that static files like images can be served directly by high-performance Nginx without going through the Gunicorn workers.
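
A sketch of the relevant Nginx configuration (paths and port numbers are hypothetical) might look like:

server {
    listen 80;

    # serve files under /static/ straight from disk, bypassing Gunicorn
    location /static/ {
        root /var/www/mysite;
    }

    # everything else is proxied to the Gunicorn workers
    location / {
        proxy_pass http://127.0.0.1:8000;
        proxy_set_header Host $host;
    }
}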

Deployment

I use Nginx as a load balancer for my web application. I configure systemd to restart my web server automatically if failures occur.
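
A systemd unit for this could look roughly like the sketch below (file name and paths are hypothetical):

# /etc/systemd/system/mysite.service
[Unit]
Description=Gunicorn instance serving my web app
After=network.target

[Service]
User=www-data
WorkingDirectory=/var/www/mysite
ExecStart=/usr/bin/gunicorn --workers 4 --bind 127.0.0.1:8000 app:app
# restart the server automatically if it ever fails
Restart=always

[Install]
WantedBy=multi-user.target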



Nowadays, security and privacy are extremely important. To prevent data from being intercepted between users and my server, we need HTTPS rather than HTTP. I use Let's Encrypt to SSL-encrypt the traffic.
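
With the certbot client, obtaining and installing a certificate for an Nginx site can be as simple as (domain name hypothetical):

sudo certbot --nginx -d example.com

Certbot updates the Nginx configuration to serve the site over HTTPS and can renew the certificate automatically.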


Learn AI Part I

A good place to start learning AI is the MNIST handwritten digit classification problem -- specifically, building an AI model and training it to recognize handwritten digits like the ones below:


The MNIST dataset contains 60,000 training images and 10,000 test images. Each image is 28x28 grayscale, which we then flatten into a 784-element input vector.

We then use TensorFlow to build an AI model with a 128-neuron hidden dense layer and a 10-neuron output layer (one neuron for each digit 0, 1, ..., 9).


'''Deep Learning of handwritten digits'''

import tensorflow as tf


# MNIST handwritten digit dataset
mnist = tf.keras.datasets.mnist

# load the data and normalize pixel values from [0, 255] to [0, 1]
(x_train, y_train),(x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

# setup model
model = tf.keras.models.Sequential([
  tf.keras.layers.Flatten(input_shape=(28, 28)),
  tf.keras.layers.Dense(128, activation='relu'),
  tf.keras.layers.Dropout(0.2),
  tf.keras.layers.Dense(10, activation='softmax')
])

model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

# train model
model.fit(x_train, y_train, epochs=5)

# evaluate on the held-out test set
test_loss, test_acc = model.evaluate(x_test, y_test)
print('test accuracy: ', test_acc)

# make predictions
test_images = x_test[100:110]
test_answers = y_test[100:110]

predictions = model.predict(test_images)
print(predictions)
print(test_answers)

Test result: accuracy 97.7%

Epoch 1/5
1875/1875 [==============================] - 18s 9ms/step - loss: 0.2988 - accuracy: 0.9127
Epoch 2/5
1875/1875 [==============================] - 16s 9ms/step - loss: 0.1443 - accuracy: 0.9572
Epoch 3/5
1875/1875 [==============================] - 16s 9ms/step - loss: 0.1079 - accuracy: 0.9673
Epoch 4/5
1875/1875 [==============================] - 16s 8ms/step - loss: 0.0890 - accuracy: 0.9729
Epoch 5/5
1875/1875 [==============================] - 16s 9ms/step - loss: 0.0745 - accuracy: 0.9759
313/313 [==============================] - 2s 6ms/step - loss: 0.0765 - accuracy: 0.9773
test accuracy:  0.9772999882698059

A digit is recognized by picking the class with the highest probability:
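
For instance, with NumPy (my own addition, not part of the original script above):

import numpy as np

# pick, for each image, the digit with the highest predicted probability
predicted_digits = np.argmax(predictions, axis=1)
print(predicted_digits)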


Below are the outputs for 10 sample images, each a row of ten probabilities for the digits 0, 1, 2, ..., 9.

[[9.7797150e-05 2.7502747e-06 2.0719497e-04 5.4697362e-06 3.9134866e-06
  1.0565845e-03 9.9850607e-01 9.4980851e-06 1.1081668e-04 6.5774897e-09]
 [9.9999189e-01 5.0800132e-12 3.8317609e-07 1.2813229e-09 2.6715312e-12
  4.8048241e-06 2.6085286e-06 1.8676683e-07 1.0828911e-09 1.7154665e-07]
 [2.4937076e-13 1.0247718e-08 1.8332835e-11 5.0632691e-05 1.1348117e-13
  9.9994743e-01 3.8600897e-10 1.8920252e-09 5.2921104e-07 1.3357593e-06]
 [1.0725232e-11 3.3831743e-10 3.0339891e-06 3.3242240e-10 9.9999702e-01
  4.7549489e-11 2.1081385e-10 2.4093074e-09 7.0132627e-10 5.4844653e-09]
 [6.9928765e-09 5.3030424e-09 4.5065480e-07 1.1416088e-02 5.9557737e-05
  1.8310407e-03 1.7897968e-07 1.2334714e-05 3.8087992e-03 9.8287153e-01]
 [6.0850289e-09 2.3812461e-09 2.7127066e-06 6.7118823e-04 1.1123864e-04
  5.3052686e-06 1.0193106e-09 4.6916386e-05 8.7647959e-06 9.9915385e-01]
 [1.8005048e-06 2.2045628e-08 9.9951398e-01 4.0768928e-04 1.2246666e-09
  1.6332680e-07 9.4513730e-09 1.8335189e-05 5.8022240e-05 9.1493292e-11]
 [1.6277669e-04 9.9575365e-01 3.8866757e-04 1.8802883e-05 6.7979881e-06
  1.3034698e-04 1.1203440e-03 7.6226373e-05 2.3423142e-03 4.5040363e-08]
 [7.6668460e-10 7.5261031e-10 4.6100594e-09 2.2274423e-06 4.3585594e-04
  1.7071790e-05 5.0292753e-11 8.0259952e-06 8.4089845e-07 9.9953604e-01]
 [5.4633720e-06 2.5156397e-09 2.6755963e-06 8.7614289e-08 9.9923217e-01
  6.8277529e-07 3.3744705e-06 3.5353223e-04 1.8205647e-07 4.0188371e-04]]

As we can verify, the AI correctly identifies them with high confidence as:

[6 0 5 4 9 9 2 1 9 4]  (the largest probability in each row above)

All other probabilities are significantly smaller, i.e., the images are unlikely to be any other digit.

More sophisticated models can achieve even higher accuracy. The Wikipedia page on the MNIST database shows the historical records set by various researchers and their models.



Learn AI Part II


Learn AI Part III


Linux Commands to Master

As a samurai must know his sword, we programmers must master our tools.

Here is a concise summary of common UNIX commands:


A PDF version can be downloaded here: Unix/Linux Command Reference by FOSSwire.com
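
To give a flavor, here is a small sample of my own (not the FOSSwire sheet itself):

ls -l                      # list files with details
cd /some/dir               # change the working directory
pwd                        # print the working directory
cp src dst                 # copy a file
mv src dst                 # move or rename a file
rm file                    # delete a file
grep -r "text" .           # search for text recursively
find . -name "*.py"        # find files by name
chmod +x script.sh         # make a file executable
ps aux                     # list running processes
tar -czf out.tar.gz dir/   # create a compressed archive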

Commit a few days to get familiar with them.


Learn Git -- Modern Distributed Source Control Tool

I consider it a must. Otherwise, it is too laborious, error-prone, or even impossible to manage source code for any project with more than a few files.

The basic git concept is illustrated below:

Some of the most common git commands are nicely summarized by RebelLabs in the figure below:
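
For a quick taste, the everyday workflow boils down to a handful of commands (a minimal sketch; the URL and file names are hypothetical):

git clone https://github.com/user/project.git   # copy a remote repository
cd project
# ... edit files ...
git status                  # see what has changed
git add file.py             # stage a change
git commit -m "Fix bug"     # record a snapshot locally
git pull                    # fetch and merge remote changes
git push                    # publish local commits to the remote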


To learn more, check out the official Pro Git book.