# Building a Character-Level Language Model with LSTMs in TensorFlow

##### Pinned 19 February 2017
neural-networks
lstm
nlp

I let a neural network read long texts one letter at a time. Its task was to predict the next letter based on those it had seen so far. Over time, it learned patterns between letters. Find out what it learned by feeding it some letters below. When you click the send button on the right, it will read your text and auto-complete it.

You can choose between networks trained on different corpora, such as a large amount of Wikipedia text or US Congress transcripts.
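
The "predict the next letter" task can be made concrete with a minimal Python sketch of how such training examples might be built. The function name `make_training_pairs` and the fixed `context_length` window are assumptions for illustration. An LSTM reads the text sequentially and carries its own state, so the actual TensorFlow model may prepare its data differently.

```python
def make_training_pairs(text, context_length):
    """Split text into (context, next_letter) examples.

    Each example pairs `context_length` consecutive characters with the
    single character that follows them.
    """
    pairs = []
    for start in range(len(text) - context_length):
        context = text[start:start + context_length]
        next_letter = text[start + context_length]
        pairs.append((context, next_letter))
    return pairs


# Usage: every window of three letters is paired with the letter after it.
for context, next_letter in make_training_pairs("hello", 3):
    print(repr(context), "->", repr(next_letter))
```

A network trained on many such pairs can auto-complete text by predicting one letter, appending it to the input, and repeating.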


# Intuitive Explanation of the Gini Coefficient

##### 10 October 2017

The Gini Coefficient is a popular metric on Kaggle, especially for imbalanced class values. But googling "Gini Coefficient" gives you mostly economic explanations. Here is a descriptive explanation with regard to using it as an evaluation metric in classification. The Jupyter Notebook for this post is here.

Spoiler Alert: The Gini coefficients are the orange areas. The normalized Gini coefficient is the left one divided by the right one.
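
As a rough sketch of the ratio described above, here is one common formulation of the Gini coefficient as an evaluation metric, written in plain Python. It is an assumption that the notebook for this post computes it the same way. The function names are illustrative, `actual` holds the true labels, and `predicted` holds the model scores.

```python
def gini(actual, predicted):
    """Area between the cumulative share of positives and the diagonal.

    Samples are sorted by predicted score (highest first); ties keep
    their original order.
    """
    n = len(actual)
    order = sorted(range(n), key=lambda i: (-predicted[i], i))
    total = sum(actual)
    cumulative = 0.0
    area = 0.0
    for i in order:
        cumulative += actual[i]
        area += cumulative / total
    # Subtract the area under the diagonal (a random ordering).
    area -= (n + 1) / 2.0
    return area / n


def normalized_gini(actual, predicted):
    """Gini of the model divided by the Gini of a perfect ordering."""
    return gini(actual, predicted) / gini(actual, actual)


labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
print(normalized_gini(labels, scores))
```

Dividing by the Gini of a perfect ordering scales the metric so that a perfect model scores 1, regardless of how imbalanced the classes are.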

# iCloud Key-Value Storage in Swift 3 / iOS 10 / Xcode 8

##### 09 August 2017
app-dev
The iCloud key-value storage works like UserDefaults, but it is synced across devices. It also survives uninstalling the app. I used it today to add iCloud backups to Emoji Diary. It required four lines of code.

### Activate the iCloud capability

Select your project in Xcode and then select the target under "Project and Targets". Activate iCloud and check "Key-Value storage".

# Directory Structure for Jekyll / GitHub Pages with Gulp and Babel

##### 07 August 2017
coding
web-dev
• Jekyll is the most popular static site generator.
• Gulp lets you automate your build process (minifying .css files, concatenating all .js files etc.).
• Babel lets you write ES6 (cool new JavaScript), even though not all browsers support it, by transpiling it into ES5 (lame old JavaScript).

I switched to using Jekyll for this blog yesterday and I love it. There are blog posts describing how to combine a subset of the above, but I could not find one for combining all four.

# Validation and Batch Learning

## Part 8 of the series "Building a Neural Network from scratch"

##### 30 October 2016
neural-networks
theory

In this post, we will continue with digit recognition and try to come closer to the benchmark of 99.8% accuracy. In the last post, we already did a test run with a network consisting of 300 hidden neurons and 10 output neurons. After only 15 epochs, it reached an accuracy of 95% on the test set. In the following epochs, however, the net overfitted, which I will show in detail below. Furthermore, optimizing the roughly 240,000 parameters took a significant amount of time per epoch. This slows down the search for good hyperparameters, such as the learning rate or the number of hidden neurons. This post is about validation, which we can use to reduce overfitting, and batch learning, which speeds up the training phase.
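
The two ideas can be sketched in a few lines of plain Python. This is only an illustration under assumptions: the function names, the 20% validation share and the fixed seed are hypothetical, and the full post may organize its training loop differently.

```python
import random


def train_validation_split(samples, validation_fraction=0.2, seed=0):
    """Shuffle the samples and hold out a fraction for validation."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_validation = int(len(shuffled) * validation_fraction)
    return shuffled[n_validation:], shuffled[:n_validation]


def mini_batches(samples, batch_size):
    """Yield consecutive chunks of at most `batch_size` samples."""
    for start in range(0, len(samples), batch_size):
        yield samples[start:start + batch_size]


training, validation = train_validation_split(list(range(10)))
for batch in mini_batches(training, batch_size=3):
    print(batch)  # one weight update would be computed per batch
```

The held-out samples are never used for weight updates, so a rising error on them while the training error keeps falling is a sign of overfitting.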

# Using Softmax for Multiple Classes

## Part 7 of the series "Building a Neural Network from scratch"

##### 25 October 2016
neural-networks
theory

In this post and the following ones, we will do digit recognition using the MNIST dataset. It consists of 70,000 images of handwritten digits (0 to 9). Our current net outputs only one value, so in this post we will modify it to tackle classification tasks with more than two classes.
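
For reference, here is a minimal sketch of the softmax function named in the title, in plain Python. It turns one raw score per class into probabilities that sum to 1. The example scores are made up, and subtracting the maximum is a standard numerical-stability trick rather than something taken from the post.

```python
import math


def softmax(scores):
    """Convert raw class scores into probabilities that sum to 1."""
    highest = max(scores)
    # Subtracting the maximum avoids overflow and leaves the result unchanged.
    exponentials = [math.exp(score - highest) for score in scores]
    total = sum(exponentials)
    return [value / total for value in exponentials]


# Hypothetical raw outputs for the ten digit classes 0 to 9.
raw_outputs = [1.2, 0.3, 0.1, 4.0, 0.0, 0.5, 0.2, 0.9, 0.1, 0.4]
probabilities = softmax(raw_outputs)
predicted_digit = probabilities.index(max(probabilities))
print(predicted_digit)
```

The class with the largest raw score also receives the largest probability, so the predicted digit is simply the index of the maximum.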

# Experimenting with Hyperparameters

## Part 6 of the series "Building a Neural Network from scratch"

##### 24 October 2016
neural-networks
theory
In this post, we will experiment with our neural network. We will test out values for hyperparameters such as the learning rate and the number of hidden neurons.

# Implementing Backpropagation in a Neural Network

## Part 5 of the series "Building a Neural Network from scratch"

##### 19 October 2016
neural-networks
theory

How do neural networks learn? So far, we implemented a single neuron and derived the update rules for its weights, i.e., we let it learn. In the last post, we created a neural network consisting of three neurons and already implemented the generation of its outputs. In this post, we will find out how to update the weights in the neural network. Thus, we will enable our network to learn by adding just 8 lines to the code.
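
To give a feel for what such a weight update looks like, here is a self-contained sketch of one backpropagation step for a network of three neurons (two hidden, one output). It is not the code from this series. The sigmoid activation, the squared-error loss, the learning rate and all names are assumptions made for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, params):
    """Return the hidden activations and the network output for input x."""
    w_hidden, b_hidden, w_out, b_out = params
    hidden = [sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
              for weights, bias in zip(w_hidden, b_hidden)]
    output = sigmoid(sum(w * h for w, h in zip(w_out, hidden)) + b_out)
    return hidden, output


def backprop_step(x, target, params, learning_rate=0.5):
    """One gradient-descent update for the loss 0.5 * (output - target)**2."""
    w_hidden, b_hidden, w_out, b_out = params
    hidden, output = forward(x, params)
    # Error signal at the output neuron (chain rule through the sigmoid).
    delta_out = (output - target) * output * (1.0 - output)
    # Propagate the error signal back to each hidden neuron.
    delta_hidden = [delta_out * w * h * (1.0 - h)
                    for w, h in zip(w_out, hidden)]
    new_w_out = [w - learning_rate * delta_out * h
                 for w, h in zip(w_out, hidden)]
    new_b_out = b_out - learning_rate * delta_out
    new_w_hidden = [[w - learning_rate * delta * xi
                     for w, xi in zip(weights, x)]
                    for weights, delta in zip(w_hidden, delta_hidden)]
    new_b_hidden = [b - learning_rate * delta
                    for b, delta in zip(b_hidden, delta_hidden)]
    return new_w_hidden, new_b_hidden, new_w_out, new_b_out


# Arbitrary starting weights: two hidden neurons, one output neuron.
params = ([[0.1, -0.2], [0.4, 0.3]], [0.0, 0.0], [0.2, -0.1], 0.0)
print(forward([1.0, 0.0], params)[1])
params = backprop_step([1.0, 0.0], 1.0, params)
print(forward([1.0, 0.0], params)[1])  # moves closer to the target 1.0
```

The error signal of the output neuron is reused when computing the error signals of the hidden neurons, which is what keeps the update cheap.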

# When One Neuron Is Not Enough

## Part 4 of the series "Building a Neural Network from scratch"

##### 18 October 2016
neural-networks
theory

In the last post, we talked about linear separability. We observed that our neuron fails to learn datasets that are not linearly separable, such as the XOR dataset. In this post, we will expand to a net of neurons that can learn more complex functions: a neural network.
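
To see why several neurons succeed where one fails, here is a small Python sketch of three threshold neurons that together compute XOR. The weights are hand-picked for illustration, not learned, and the step activation is an assumption. The network in the post may use a different activation and will learn its weights.

```python
def neuron(inputs, weights, bias):
    """Threshold neuron: outputs 1 if the weighted sum is positive, else 0."""
    weighted_sum = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if weighted_sum > 0 else 0


def xor_network(x1, x2):
    """Two hidden neurons (OR and AND) feed one output neuron."""
    hidden_or = neuron([x1, x2], [1, 1], -0.5)
    hidden_and = neuron([x1, x2], [1, 1], -1.5)
    # Fires when OR is active but AND is not: exactly one input is 1.
    return neuron([hidden_or, hidden_and], [1, -1], -0.5)


for x1, x2 in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x1, x2, "->", xor_network(x1, x2))
```

Each hidden neuron draws one straight line through the input space, and the output neuron combines the two half-planes into a region that no single line could describe.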

# From Regression To Classification

## Part 3 of the series "Building a Neural Network from scratch"

##### 17 October 2016
neural-networks
theory

We successfully extended our neuron, so that it can handle all datasets generated by a linear continuous function. Now, we will move to classification tasks, meaning that the target values in the dataset are discrete and not continuous.

### Regression vs. Classification

So far, all problems were regression tasks. Our neuron was given two input variables, which it used to predict a continuous target variable. In the last problem of the last post, we introduced a classification problem. The solution for an input vector was either 0 or 1. Our neuron failed to solve this problem because it can only fit a linear function for solving the problem. It generates a least squares fit. For linear regression tasks, this suffices, but for classification problems, the output value should not change linearly, but abruptly.

# Flexible Neurons

## Part 2 of the series "Building a Neural Network from scratch"

##### 16 October 2016
neural-networks
theory
In the last post, we created a neuron that was able to learn a dataset generated by a simple linear function $$y = 0.58 x_1 + 0.67 x_2$$. Now, we will modify the function behind our dataset just a bit and suddenly our neuron fails to predict the target variable accurately. We will identify the problem and modify the neuron. Thus, we will make the neuron more flexible. $$\vec{\hat{y}} = \vec{x} \cdot \vec{w} \color{orange}{+ \vec{b}}$$
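
A short Python sketch shows what the added bias term buys us. The offset of 5.0 is a made-up example, since the modified function from the full post is not shown here. Only the weights 0.58 and 0.67 come from the text above.

```python
def predict(x, w, b=0.0):
    """Neuron output: dot product of inputs and weights, plus a bias."""
    return sum(xi * wi for xi, wi in zip(x, w)) + b


weights = [0.58, 0.67]

# Without a bias, an all-zero input always yields 0, whatever the weights are.
print(predict([0.0, 0.0], weights))

# With a bias, the neuron can also represent functions with a constant offset.
print(predict([0.0, 0.0], weights, b=5.0))
```

No choice of weights can shift the output for an all-zero input, which is why a dataset generated with a constant offset cannot be fitted by the original neuron.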

# The Neuron

## Part 1 of the series "Building a Neural Network from scratch"

##### 15 October 2016
neural-networks
theory
This is the first post of a series about understanding Deep Neural Networks. We will start with the core component of artificial neural networks: the neuron. We will use a single artificial neuron to learn a simple dataset.

Let's say we want to predict how much money we will probably earn in 2016. We take a look at our financial data and see that there seems to be some kind of correlation between our gross salary, our inheritance and how much money we actually earn in that year. But instead of googling tax rates, we decide to throw Machine Learning at the problem and hope that it does all the thinking for us.

| Year | Gross Salary [$] | Inheritance [$] | Net Income [$] |
|------|-----------------:|----------------:|---------------:|
| 2011 | 80,000 | 10,000 | 53,100 |
| 2012 | 85,000 | 5,000 | 52,650 |
| 2013 | 85,000 | 0 | 49,300 |
| 2014 | 120,000 | 30,000 | 89,700 |
| 2015 | 140,000 | 0 | 81,200 |
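
As a quick sanity check on the table, the following Python sketch recovers the two weights behind the data. Note the swapped-in technique: it solves the least-squares problem in closed form instead of training a neuron step by step as the post does. The names `x1` and `x2` follow the formula $$y = 0.58 x_1 + 0.67 x_2$$ quoted in the Part 2 summary, and the function name is hypothetical.

```python
# Rows copied from the table: (x1 = gross salary, x2 = second input column,
# y = net income) for the years 2011 to 2015.
rows = [
    (80000, 10000, 53100),
    (85000, 5000, 52650),
    (85000, 0, 49300),
    (120000, 30000, 89700),
    (140000, 0, 81200),
]


def fit_two_weights(rows):
    """Least-squares fit of y = w1 * x1 + w2 * x2 (no bias term)."""
    s11 = sum(x1 * x1 for x1, _, _ in rows)
    s12 = sum(x1 * x2 for x1, x2, _ in rows)
    s22 = sum(x2 * x2 for _, x2, _ in rows)
    t1 = sum(x1 * y for x1, _, y in rows)
    t2 = sum(x2 * y for _, x2, y in rows)
    # Solve the 2x2 normal equations with Cramer's rule.
    determinant = s11 * s22 - s12 * s12
    w1 = (t1 * s22 - s12 * t2) / determinant
    w2 = (s11 * t2 - s12 * t1) / determinant
    return w1, w2


w1, w2 = fit_two_weights(rows)
print(round(w1, 4), round(w2, 4))
```

The fitted weights should match the coefficients 0.58 and 0.67, which is what the neuron is expected to approach during training.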

# Java's Data Structures

##### 20 April 2016
coding
This week, I will have a technical interview for a Software Engineering internship. To solve coding problems, it is essential to remember the most frequently used data structures. It is even better to know how they are implemented. So, in order to structure my thoughts, I will cover the major data structures and their implementation in Java 8.

# Visualizing Travel Times with Multidimensional Scaling

##### 13 January 2016
visualization
Which map is correct?
In a geography exam, the correct answer would be the left / upper one. It displays the actual positions of four cities in the US. But that does not make the other map incorrect. It just displays other data. Specifically, it approximates the travel times between the four cities: the closer two cities are on the right map, the faster you can travel between them by public transport. We can calculate such maps using Multidimensional Scaling. What is Multidimensional Scaling? How can it help us to approximate travel times? And what is the relationship between the left map with the actual positions and the right map? We are about to find out.

# 8 facts I did not know about Java

##### 11 December 2015
coding
I found out some helpful and some surprising properties of Java 6. Did you know that to this day, Java has Go To statements? Here are eight facts about the language I was not aware of. Would you have known all of them?