Piping Bitcoin RPC Commands

Getting the block header of the latest block on the Bitcoin blockchain using bitcoin-cli is a little tricky (and hard to say!). You first need to find the latest block number (height), then find the hash of that block, and finally get the header using the hash.

Since the getblockheader command expects a blockhash as a parameter, I use pipes to feed the result of one command into the next.

The pipe runs the following commands in this order.

  • First get the chain height using getblockcount
  • Feed this result to getblockhash to get the hash
  • Feed this result to getblockheader
  • Result is the header of the latest block

The result is a one-line command to get the latest block header!

$ bitcoin-cli getblockcount | xargs bitcoin-cli getblockhash | xargs bitcoin-cli getblockheader
{
  "hash": "00000000000000000001e372ae2d2bc91903bd065d79e126461cd2bf0bbe6b3d",
  "confirmations": 1,
  "height": 600417,
  "version": 545259520,
  "versionHex": "20800000",
  "merkleroot": "e58f963d486c0a626938851ba9bfb6e4886cabcf2302573f827ca86040f997a3",
  "time": 1571688192,
  "mediantime": 1571685251,
  "nonce": 1693673536,
  "bits": "1715a35c",
  "difficulty": 13008091666971.9,
  "chainwork": "000000000000000000000000000000000000000009756da038619f842bfff6b6",
  "nTx": 2577,
  "previousblockhash": "0000000000000000000dbd8aada824ee952e87ef763a862a8baaba844dba8af9"
}

IoT with Node-RED and Python

Raspberry Pi + Node-RED + Python + MQTT

Now that I have two Raspberry Pis running, one as a Bitcoin full node and the other mostly used as a dev/experimentation machine, I decided it’s time to put the dev machine to some use.

I’d also like to learn more about IoT (Internet of Things) and how devices are wired together and communicate, so this is a great opportunity to ‘Learn by Doing’.

To this end I’ve started to experiment with the MQTT messaging protocol that is commonly used for IoT devices.

To start, what is MQTT?

MQTT (Message Queuing Telemetry Transport) is an ISO standard, lightweight, publish-subscribe network protocol that transports messages between devices.

MQTT on Wikipedia

This allows us to very easily send sensor data between devices without having to invent the communication medium ourselves. Most IoT gateways support MQTT out of the box and it’s widely supported across many programming languages (list here).

As a test I’ll create a Node-RED flow on my Raspberry Pi that publishes (sends) messages to a local MQTT server; these messages will then be ‘read’ by a Python script running on my Windows laptop. I’ll also add a flow where the Python script on Windows publishes messages that are then read by the Node-RED flow.

Node-RED Flow

MQTT Node-RED flow

MQTT in and out nodes are included as part of the standard Node-RED installation, so creating a flow is trivially easy. All of the MQTT handling is contained in a single node, while the rest of the flow just creates the message to send.

Publish Flow

MQTT Publish flow
Publish flow

The inject nodes are just there to manually trigger the flow. The true trigger causes the exec node to execute a command on the Raspberry Pi; in this case it gets the system temperature. The result is then published to the MQTT server on the ‘iot’ topic.
The command to get the system temperature on a Raspberry Pi is shown here.

$ /opt/vc/bin/vcgencmd measure_temp

Topics in MQTT are just a way to keep different messages together: if you publish to a specific topic, then other clients subscribed to that topic will receive the messages.
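As a rough illustration of the publish side (separate from the Node-RED flow), sending a single message to a topic from Python with the Paho library is a one-liner. The broker address below is a placeholder for wherever your MQTT server is running:

import paho.mqtt.publish as publish

# Publish one message to the 'iot' topic (1883 is the default MQTT port).
# Replace the hostname with the address of your own MQTT server.
publish.single("iot", payload="hello from python", hostname="192.168.1.50", port=1883)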

Subscribe Flow

MQTT Subscribe flow
Subscribe flow

The lower two nodes subscribe to a topic I’ve called ‘python’. The flow is triggered when the Python script publishes to that topic, and the message is output to the debug console in Node-RED.

Configuring the MQTT Nodes

By default the MQTT nodes use a local server on port 1883 that is already set up for you. Unless you want to use your own or a remote server, just leave these as-is. The topic is entirely up to you; just make sure you use the same topic in the client that reads the messages.

MQTT server configuration

MQTT Python Script

For the Python client running on my laptop I’ll use the Eclipse Paho library. To install it use:

pip install paho-mqtt

The full script looks like this.

import paho.mqtt.client as mqtt
import os

# The callback for when the client receives a CONNACK response from the server.
def on_connect(client, userdata, flags, rc):
    print("Connected with result code "+str(rc))

    # Subscribing in on_connect() means that if we lose the connection and
    # reconnect then subscriptions will be renewed.
    client.subscribe("iot")

# The callback for when a PUBLISH message is received from the server.
def on_message(client, userdata, msg):
    print("Topic: {} / Message: {}".format(msg.topic,str(msg.payload.decode("UTF-8"))))
    if(msg.payload.decode("UTF-8") == "Reply"):
        client.publish("python", os.environ.get('OS',''))

client = mqtt.Client()
client.on_connect = on_connect
client.on_message = on_message

# Use the IP address of your MQTT server here
SERVER_IP_ADDRESS = "0.0.0.0"
client.connect(SERVER_IP_ADDRESS, 1883, 60)

# Blocking call that processes network traffic, dispatches callbacks and
# handles reconnecting.
# Other loop*() functions are available that give a threaded interface and a
# manual interface.
client.loop_forever()

The code is well commented but essentially it creates a connection to the MQTT server (created by the Node-RED flow on my Pi). Replace the IP address with your local server or use 127.0.0.1 if the script runs on the same computer as the server.

The script then waits for messages on the ‘iot’ topic and, when one is received, prints the message to the console. If the message is ‘Reply’ then the script also publishes a message (the Windows OS version) to the ‘python’ topic, which will be picked up by the Node-RED flow and displayed there.

Putting it Together

To start sending and receiving messages, first deploy the Node-RED flow and then start the Python script. Running the script produces the output below, showing that it is connected and now waiting for messages.

>python mqtt.py
Connected with result code 0

Injecting the ‘true’ node queries the Pi for the system temperature and sends it to the ‘iot’ topic on the MQTT server, which the Python script picks up and displays as shown below. Here I ran the flow four times, so we get four messages with temperatures displayed in Python on my laptop.

Topic: iot / Message: temp=48.3'C
Topic: iot / Message: temp=48.3'C
Topic: iot / Message: temp=48.9'C
Topic: iot / Message: temp=48.3'C

If I now send the ‘Reply’ message from Node-RED we see this in Python.

Topic: iot / Message: Reply

In Node-RED we see a debug message with the message sent from Python to the ‘python’ topic we subscribed to in Node-RED ("Windows_NT").

Node-RED debug output

Testing from iOS

In the app store there are quite a few MQTT clients available. I tried a few but MQTTool was the most reliable for me. It allows you to connect to a server and both publish and subscribe to topics. Just connect to your MQTT server and test!

Next Steps

This was a trivial example of using MQTT to send and receive messages, but the next plan is to extend this with sensor data that can be sent to Node-RED running on a virtual server.

This way I can securely make sensor data available from the internet, as well as choosing to store the data in a database or cloud storage service.

Introduction to Image Classification using UiPath and Python

Image classification
A python!

After my previous post showing image classification using UiPath and Python generated many questions about how to implement it, I decided to expand upon the theme and give a more detailed description of how to achieve this.

My starting point was thinking about how I might integrate UiPath with Python, now that Python is supported within the platform.

I find thinking of potential solutions and use cases just as much fun as actually making the automations and it feeds the creative mind as well as the logical.

Python is also growing explosively right now and this leads to a vast array of possibilities.

To see just how Python is growing, see this article from Stack Overflow.

Programming language popularity
Python Growth!

The first thing to point out is that although I used UiPath as my RPA platform this could in theory be any platform that supports Python. I also use Alteryx, and this could easily be integrated into an Alteryx workflow to programmatically return the classifications and confidence levels.

Note that these instructions cover installation on Windows, but the Python and Tensorflow parts could easily be done on Mac or Linux; just follow the platform-specific instructions.

Basic Requirements

Python

Obviously this won’t work without Python being installed 🙄. I used Python 3.5 but any Python 2 or 3 version should work. Download Python for free and follow the installation instructions. There’s a useful guide to the installation process at realpython.com. This is easy and will take 10 minutes to get going.

Tensorflow

This is the python library that does the heavy lifting of the image classification. Or in the words of Wikipedia:

TensorFlow is an open-source software library for dataflow programming across a range of tasks. It is a symbolic math library, and is also used for machine learning applications such as neural networks.

Wikipedia

It’s completely free to use and released by Google. I won’t go into the installation in detail since it’s well documented on the Tensorflow site, but the whole process took me another 10 minutes. Easy as 🥧

Python 3.X installs the pip package manager by default so in my case installing Tensorflow was as simple as typing the following command into the command line.

pip install --upgrade tensorflow
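If you want to confirm the install worked, a quick sanity check is to print the version from a Python prompt; it should import without errors.

import tensorflow as tf
print(tf.__version__)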

UiPath

RPA (Robotic Process Automation) is also growing exponentially right now so it’s a great time to learn how it works and what benefits it can bring.

The RPA software market overall witnessed a growth of 92 to 97 percent in 2017 to reach US$480 million to $510 million. The market is expected to grow between 75 and 90 percent annually up to 2019.

If you don’t already have RPA software and want to integrate into an automated solution you can download and use the community version of UiPath for free (Windows only).

UiPath is very powerful and yet easy to use, but apart from the technology a major advantage they have is the large and growing community so solutions are often posted to their forums. On top of that they have free training available online. What’s not to like?

Download the Tensorflow Models

Let’s get into the details now starting with downloading the models we will use.

If you use git you can clone the Tensorflow models repository from this link: https://github.com/tensorflow/models

Otherwise you can use a browser and navigate to the above page and then download the models as a zip file using the link in the top right corner.

Clone from Github

Save the zip file to your computer and unzip it somewhere on your C: drive. The location isn’t important as long as you know where it is.

Download the Pre-trained Model

This can be done in one of two ways, either:

Method 1

Find the location of the unzipped models folder from the previous step and go into the following directory (in my case the root directory is called models-master): models-master > tutorials > image > imagenet

Once there, open a command prompt and run the Python file called classify_image.py

imagenet output
Your first classified image

If all goes to plan this downloads the pre-trained model to your C: drive and saves it to C:\tmp\imagenet; it also runs the classifier on a default image in the downloaded folder. As you can probably work out, the image is of a panda 🙂

giant panda, panda, panda bear, coon bear, Ailuropoda melanoleuca (score = 0.89632)
indri, indris, Indri indri, Indri brevicaudatus (score = 0.00766)
lesser panda, red panda, panda, bear cat, cat bear, Ailurus fulgens (score = 0.00266)
custard apple (score = 0.00138)
earthstar (score = 0.00104) 

If you got this far, well done, you’ve already done image classification using python and Tensorflow!

If you get warnings in the output, as I did, you can safely ignore them as long as the classifier still produces the output. They appear purely because we are using the default Tensorflow build, which is designed to work on as many CPUs as possible and so does not optimise for any CPU extensions.

To fix them you would need to compile Tensorflow from source, which is out of scope for this tutorial (see here for more info: https://stackoverflow.com/questions/47068709/your-cpu-supports-instructions-that-this-tensorflow-binary-was-not-compiled-to-u)

Method 2

Alternatively you can take a shortcut and download the pre-trained model directly from here, http://download.tensorflow.org/models/image/imagenet/inception-2015-12-05.tgz

Extract the files (you’ll need something like 7-zip for that) and save them to C:\tmp\imagenet.

Modify the Image Classifier for Automation

The classify_image.py script could easily be used directly with Python, but as it stands it only prints the data to the command line and does not return any data from the function.

We could just change the script to write the output to a text file which UiPath could read, but it’s much cleaner and more efficient if we alter the code to return a list from the classifier function, which can be converted into a .NET object in UiPath.

This also gives us the advantage that we can load the script just once into UiPath and then call the classifier function each time we need to classify an image, saving considerable time and resources.

The modified Python file (‘robot_classify_image.py’) can be downloaded from my GitHub repository, https://github.com/bobpeers/uipath, and placed somewhere it can be called from your automation workflow.
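Conceptually the change is just to wrap the classification in a function that returns its results rather than printing them. The sketch below is purely illustrative (helper names and details differ in the real robot_classify_image.py):

# Illustrative only – the real robot_classify_image.py differs in detail.
def main(image_path, num_top_predictions):
    # run_classifier stands in for the Inception inference code in the script
    predictions = run_classifier(image_path, num_top_predictions)
    # Return 'score;label' strings so UiPath receives a simple list of strings
    return ["{0:.2%};{1}".format(score, label) for score, label in predictions]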

To test that the file works you can call it from the command line as follows.

C:\>robot_classify_image.py <full_path_to_image> <number_of_predictions>

For example this will return three predictions on the bike image.

C:\>robot_classify_image.py "C:\work\UiPath\Images\images\bike.jpg" 3

By default the script will not print the results to the console but if you wish to see them simply uncomment the print() line in the script:

for node_id in top_k:  
    human_string = node_lookup.id_to_string(node_id)
    score = predictions[node_id]
    #enable print for testing from command line
    #print('%s (score = %.5f)' % (human_string, score)) 
    returnValue.append("{0:.2%};{1}\n".format(score,human_string))

Note that if you saved the pre-trained model somewhere other than C:\tmp\imagenet you can edit the Python to point to the saved location by replacing all instances of ‘/tmp/imagenet/’ with your path (be sure to keep the forward slashes).

UiPath Workflow

Most of the hard work is now done. All that’s left is to integrate this into a UiPath workflow, which is simple 😊

Use a Python Scope

All the python activities must be contained inside a Python Scope container.

Set the path to your Python installation path and the target to x64 for 64-bit systems or x86 for 32-bit. Leaving the version as auto will auto-detect your version.

Load the python script

First we load the python script into UiPath using a Load Python Script activity.

Set the result of this activity to a variable; in my case it’s called pyScript.

Invoke the Python Method

Next we use the Invoke Python Method activity that will actually run the classification method.

In the Invoke Python Method activity enter the name of the function to call (‘main‘) along with the script object from above as ‘Instance’.

The function ‘main’ expects two arguments (‘Input Parameters’): the full path to the image file and the number of predictions required, sent as an array (using variables in my case).

The function returns a Python Object called pyOut in my case.

Get Return Values

The Get Python Object activity takes the returned value (pyOut) from the previous activity and converts it into an array of strings (which is the return value from Python).

We can then loop through the array, extract each prediction line and use it for further processing, or display it on a callout as I did in the video.

All finished, take a coffee on me 😅

Summary

Once the basics are set up, using the classifier is extremely easy and returns values very quickly. As you look at more images you’ll also realise that sometimes the model is not certain about the results, so make sure you check the confidence level before continuing processing.

My suggestion would be to automate anything over 80–90%, depending on the use case of course, and put everything else aside for manual handling.

The classifier uses about 1,000 classes to identify objects, but you could always retrain it on your own images. The Tensorflow docs are here if you want a challenge: https://www.tensorflow.org/hub/tutorials/image_retraining

Have fun 🤖💪

How to use Amazon S3 from Node-RED

Node-RED + Amazon S3

Amazon S3 (Simple Storage Service) is a very commonly used object storage solution that’s cheap to use and highly reliable. Think of it as a file system in the cloud with enterprise features that you can use to store almost anything.

Amazon S3

This guide assumes you already have a working Amazon S3 account and you have created a storage bucket along with a user authorized to read and write to the bucket. You must also have the Key ID and Secret Key for the user so we can authenticate from Node-RED.

Node-RED Flow

Open Node-RED and add the node-red-node-aws palette. This will install nodes for reading, writing and watching for events in your bucket.

To test it, create a simple flow like the one below where you input some data using the inject node, append the data to a text file and then upload the file to your S3 bucket using the amazon s3 out node.

Node-RED test flow

The configuration of the amazon S3 out node should look like this:

  • AWS is where you enter your AccessKeyID and Secret Access Key
  • Bucket is the name of the S3 bucket you created
  • Filename is the name of the file you want to create in S3 including any folder path
  • Local filename is the file you wish to upload
  • Region is the AWS region your S3 bucket is located in
Amazon S3 Out Configuration

That’s all there is to it. When you deploy and run the flow, the inject node appends the timestamp to the end of the upload.txt file and then uploads the file to S3.

If you log into the S3 console you’ll see the file and contents.

Amazon S3 Console

Previewing the file contents in S3 shows the appended timestamps.

Amazon S3 file preview
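If you’d rather verify the upload programmatically than through the console, a minimal boto3 sketch like this also works. The bucket name and key below are placeholders for the ones you used in the flow, and it assumes your AWS credentials are already configured locally:

import boto3

# Read back the file the Node-RED flow uploaded (bucket and key are placeholders)
s3 = boto3.client("s3")
obj = s3.get_object(Bucket="my-test-bucket", Key="upload.txt")
print(obj["Body"].read().decode("utf-8"))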

Linux Command Line Calendar

I’ve used Linux for almost 20 years and somehow never knew you could get a calendar on the command line 🤯🤯

Just type ‘cal’ for the current month or cal followed by the year (‘cal 2019’ for example) to get a full year. See the man pages for details.

me@myserver:~$ cal 2019
                             2019
       January               February               March
 Su Mo Tu We Th Fr Sa  Su Mo Tu We Th Fr Sa  Su Mo Tu We Th Fr Sa
        1  2  3  4  5                  1  2                  1  2
  6  7  8  9 10 11 12   3  4  5  6  7  8  9   3  4  5  6  7  8  9
 13 14 15 16 17 18 19  10 11 12 13 14 15 16  10 11 12 13 14 15 16
 20 21 22 23 24 25 26  17 18 19 20 21 22 23  17 18 19 20 21 22 23
 27 28 29 30 31        24 25 26 27 28        24 25 26 27 28 29 30
                                             31
    April                  May                   June
 Su Mo Tu We Th Fr Sa  Su Mo Tu We Th Fr Sa  Su Mo Tu We Th Fr Sa
     1  2  3  4  5  6            1  2  3  4                     1
  7  8  9 10 11 12 13   5  6  7  8  9 10 11   2  3  4  5  6  7  8
 14 15 16 17 18 19 20  12 13 14 15 16 17 18   9 10 11 12 13 14 15
 21 22 23 24 25 26 27  19 20 21 22 23 24 25  16 17 18 19 20 21 22
 28 29 30              26 27 28 29 30 31     23 24 25 26 27 28 29
                                             30
     July                 August              September
 Su Mo Tu We Th Fr Sa  Su Mo Tu We Th Fr Sa  Su Mo Tu We Th Fr Sa
     1  2  3  4  5  6               1  2  3   1  2  3  4  5  6  7
  7  8  9 10 11 12 13   4  5  6  7  8  9 10   8  9 10 11 12 13 14
 14 15 16 17 18 19 20  11 12 13 14 15 16 17  15 16 17 18 19 20 21
 21 22 23 24 25 26 27  18 19 20 21 22 23 24  22 23 24 25 26 27 28
 28 29 30 31           25 26 27 28 29 30 31  29 30
   October               November              December
 Su Mo Tu We Th Fr Sa  Su Mo Tu We Th Fr Sa  Su Mo Tu We Th Fr Sa
        1  2  3  4  5                  1  2   1  2  3  4  5  6  7
  6  7  8  9 10 11 12   3  4  5  6  7  8  9   8  9 10 11 12 13 14
 13 14 15 16 17 18 19  10 11 12 13 14 15 16  15 16 17 18 19 20 21
 20 21 22 23 24 25 26  17 18 19 20 21 22 23  22 23 24 25 26 27 28
 27 28 29 30 31        24 25 26 27 28 29 30  29 30 31

Bitcoin RPC Commands over SSH Tunnel

SSH Port Forwarding Explained

If you’re running a Bitcoin full node and want to run RPC commands against the Bitcoin client from a remote machine, the easiest and safest way to do this is to use port forwarding over an SSH connection.

What is Port Forwarding used for?

  • Secure access to a port that is otherwise not listening on a public network interface. This is common with database servers like MySQL.
  • Encryption for services that may not natively use encrypted connections.

Port Forwarding – https://docs.termius.com/termius-handbook/port-forwarding

This also gives you the flexibility of using Python (or another language) from the remote machine without having to install it on the Bitcoin node.

In my case I’m going to use Python in a Jupyter Notebook to query the node, using Termius as the SSH client.

The Bitcoin node is running on my local network and does not accept RPC commands from the internet, but using port forwarding I’ll be able to query it from my laptop from any location.

Install an SSH Client

On Windows I recommend Termius as it’s very easy to use and has a nice graphical interface (it’s also available for Mac, Linux, Android and iOS), but you could use any SSH client (PuTTY for example).

First create an SSH host to the Bitcoin full node.

Termius hosts
Create an SSH host

Then create the forwarded port. On your local machine you can select any port that’s not in use; in my case I use port 10000.

When I connect to my local machine on port 10000 the port is then securely forwarded to the remote machine on port 8332, which is the port the Bitcoin RPC server listens on by default.

So 127.0.0.1:10000 becomes BITCOIN_NODE:8332

Termius port forwarding
Forward a local port to the Bitcoin RPC port (8332)

The configuration page should look something like this.

Configuring Port Forwarding
Configuration pane for port forwarding

Open the port by clicking connect.

Connect the port forwarding
Connect the forwarded port

Python Bitcoin Library

To use Python with your Bitcoin node, use the python-bitcoinrpc library. To install it simply run:

pip install python-bitcoinrpc

Next, get the rpcuser and rpcpassword you added to your bitcoin.conf file.

rpcuser=thisismyuser
rpcpassword=DONT_USE_THIS_YOU_WILL_GET_ROBBED_ijfr84ur84uof94ur9r4

Once installed, create a connection to the node using these credentials. The IP will always be localhost (127.0.0.1) and the port is the same port you used for the forwarding, 10000 in my case.

from bitcoinrpc.authproxy import AuthServiceProxy, JSONRPCException
USERNAME = ******
PASSWORD = ******
IP = "127.0.0.1:10000"
    
rpc_connection = AuthServiceProxy("http://{}:{}@{}".format(USERNAME, PASSWORD, IP), timeout = 500)

Once connected we can query the node using regular RPC commands. Here I get the last 10 blocks and print the block height, timestamp, number of transactions in the block, difficulty and nonce.

bci = rpc_connection.getblockchaininfo()
maxBlock = bci["blocks"]  # current chain height
# Walk back over the last 10 blocks, newest first
for i in range(maxBlock, maxBlock-10, -1):
    bbh = rpc_connection.getblockhash(i)
    bh = rpc_connection.getblockheader(bbh)
    print(bh["height"], bh["time"], bh["nTx"], bh["difficulty"], bh["nonce"])

Command output:

597577 1570042322 2866 12759819404408.79 872368408
597576 1570041887 2921 12759819404408.79 2413129693
597575 1570041233 3406 12759819404408.79 2989319068
597574 1570039252 2884 12759819404408.79 3248003543
597573 1570038909 3061 12759819404408.79 259424928
...

This command returns the network statistics of the node.

net = rpc_connection.getnettotals()
print(net)
{'totalbytesrecv': 9043069394, 'totalbytessent': 83507300429, 'timemillis': 1570047435410, 'uploadtarget': {'timeframe': 86400, 'target': 5242880000, 'target_reached': False, 'serve_historical_blocks': True, 'bytes_left_in_cycle': 3230191636, 'time_left_in_cycle': 55100}} 

For a full list of the currently available API calls see the Bitcoin Developer Reference.

Generating New Product Names using Neural Networks

So everyone knows Machine Learning / Artificial Intelligence / Cognitive Computing, call it what you will, is the new marketing catchphrase for people trying to sell their software products and services. You can be sure if it’s not already baked in then it’s in the roadmap for 2020.

It used to be ‘Big Data’, but we got tired of hearing that, so a few control+h presses later and, hey presto, Machine Learning (ML) has arrived.

Don’t get me wrong, I’m convinced ML will have a profound effect in the coming years, but like most technologies, we overestimate the short term effect and underestimate the long term.

As the saying goes, the future is already here – it’s just not very evenly distributed.

I read lots of articles on ML that seem fantastic but it’s hard to get a grasp on something when you haven’t really used it for yourself. I wanted to know if ‘ordinary’ people can use it, and what for? To satisfy my curiosity I decided to see if I could train a neural network to generate product names for clothing based on the product names we are already using in IC Group.

Getting Training Data

Data is the raw material for Neural Networks and the more data the better. If your data is already big then great! If not then don’t worry, you can still get interesting results.

To feed the network I extracted the entire history of style names of our three core brands, namely Peak Performance, Tiger of Sweden and By Malene Birger.

After cleaning the data to remove numbers and other ‘junk’ (for example Peak Performance often starts style names with the abbreviation ‘JR’ for junior), the raw data consisted of the following number of style names.

  • Peak Performance: 7,590
  • Tiger of Sweden: 13,087
  • By Malene Birger: 15,419

Not a huge corpus of data to work with, but hopefully enough to generate something of interest.

How Does This Thing Work?

The type of Neural Network I used is technically called a Recurrent Neural Network, or RNN for short. It essentially takes training data and ‘learns’ patterns in the data by feeding the data through layers. It also has some ‘memory’ (called LSTM, or Long Short-Term Memory) so that, as well as the current input having influence, it also selectively remembers or forgets the result of previous iterations.

For text this means you can feed the network large passages of text and the network will ‘learn’ how to write new text without knowing anything about grammar, spelling or punctuation. If you feed it all of Shakespeare’s works and train enough it will generate text that looks like real Shakespeare but is completely new work!

It may sound pretty complicated (and it is) but as a user you don’t really need to know much to get started. There are ready-to-use scripts everywhere on the internet (GitHub + Google are your friends) that have full instructions. It’s very much plug and play and took me about an hour to get started from scratch.

I’ve also included links at the bottom of the article pointing to the code I used.

Our Current Product Names (The Training Data)

To give you an idea of what types of product names we currently use, I selected a few at random. Note that they are all short names (no more than 10 characters) and are not always ‘real’ words or even names.

Product names
A sample of our current product names

The names tend to have a brand ‘feel’; for example By Malene Birger uses softer, slightly exotic-sounding names to fit their brand image and target consumer. It will be fun to see if the Neural Network can get this detail right.

Training the Network

This process is surprisingly simple. Just feed the network a text file with all the current names, one file per brand, then run the training script, sit back and get a coffee or three.

Neural Network Training

Since the training data is fairly small this doesn’t actually take very long (it took me a couple of hours per brand using a virtual machine) but is highly dependent on a handful of parameters that can be set plus the capabilities of your computer. Probably the most important parameters are these:

  • Number of layers in the network
  • RNN size, the number of hidden units (or nodes) in the network
  • Training Epochs, basically how long to train the model for

Basically, more layers, more nodes per layer and longer training give better results, but can take much longer and the benefit isn’t always worth the effort. Trial and error often works just as well!

Does This Thing Really Work?

After training the model we simply use it to generate new names. This is called sampling the model; you can generate samples using some starting text, but in my case I just let the model pick a random starting point.

So here’s a sample of the names generated per brand.

Neural network results
Names generated from the neural network

Bearing in mind that the network knows nothing about language, I think it did a remarkably good job of capturing the essence of the brands’ names.

To emphasise once again, the network doesn’t know anything about the constructs of words, what vowels are or anything else for that matter. It learns these patterns purely from the training data and then builds a model to generate new words using the same rules.

The model can be sampled over and over again so there’s an unlimited supply of names.

Can Neural Networks be Creative?

If we really want to play around we can change the parameters of the sampling to try and generate more creative names.

One of these parameters (called temperature) basically tells the network how confident it should be about the name (actually how confident it should be about the next letter in the generated word). If we turn up the temperature the model becomes more aggressive and suggests ‘wilder’ names.
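For the curious, this is roughly what temperature does during sampling: the next-character scores are rescaled before picking, so higher temperatures flatten the distribution and make unlikely letters more probable. A small illustrative sketch (not the torch-rnn code itself):

import numpy as np

def sample_char(logits, temperature=1.0):
    # Scale the raw scores, then turn them into probabilities (softmax)
    scaled = np.asarray(logits, dtype=float) / temperature
    probs = np.exp(scaled - np.max(scaled))
    probs /= probs.sum()
    # Higher temperature -> flatter probabilities -> 'wilder' choices
    return np.random.choice(len(probs), p=probs)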

Neural network generated names
Some more exotic examples

I would definitely buy a blazer from Tiger of Sweden called JUGOMAR or maybe my girlfriend would like a dress from By Malene Birger called CIBBAN or some Peak Performance ski pants called RANDEN.

Of course, if we turn the creativity up too much then it starts to generate some nonsense!

Crazy neural network generated names
It’s starting to go crazy!

But even in the weirdness we get names like FLAURELAYKS and KAWLAN, which I think sound like great product names 😃

Summing Up

This was of course all done for fun, but it shows that these types of networks are not impossible to use and someone with decent computer skills can get these up and running in a matter of hours.

If ML really is going to explode in the coming years then these tools will need to be easier to interact with than they are today. There will never be enough data scientists to satisfy demand, so just like spreadsheet programs made everyone a numbers whizz, I expect user interfaces and APIs will be developed so less skilled users can create, train, and deploy ML models into production.

It Almost Makes Sense

As a final challenge I tried making new product descriptions by training the model on current descriptions. It almost makes sense but could maybe do with a bit more training 😉

This is one for Peak Performance!

Stylish Mid feel shortany ski town, it with a shell is a fixent windproof, comfortable, keeping this fit delivers the wicking, breathable Joad.

References If You Feel Inspired To Try Yourself!

If you feel like reading more or even trying for yourself then the code for the RNN is available to download here.

https://github.com/jcjohnson/torch-rnn

And more general reading on generating text using an RNN is here.

http://karpathy.github.io/2015/05/21/rnn-effectiveness/

Getting Database and Table Sizes in Postgres

Total Database Size

This SQL simply gets the total size of the database in a human readable format.

SELECT pg_size_pretty(pg_database_size('postgres')) as db_size

List all Tables

This lists all the tables in the database public schema.

SELECT tablename FROM pg_catalog.pg_tables WHERE schemaname = 'public'

Search Schema for Column Name

I often need to search all the tables or views to find which contain a particular column. Replace ‘COLUMN_NAME’ with your column below.

SELECT t.table_schema,t.table_name
FROM information_schema.tables t
INNER JOIN information_schema.columns c 
      ON c.table_name = t.table_name 
      AND c.table_schema = t.table_schema 
WHERE c.column_name = 'COLUMN_NAME'
      AND t.table_schema not in ('information_schema', 'pg_catalog')
      AND t.table_type = 'BASE TABLE'
ORDER BY t.table_schema;

In this case I searched for all tables containing a column named ‘order’ (swap the equality for LIKE '%order%' if you want a partial match).

Table Sizes

Retrieve the size per table in the public schema from largest to smallest.

SELECT nspname || '.' || relname AS "table_name",
        pg_size_pretty(pg_total_relation_size(C.oid)) AS "total_size"
FROM pg_class C
LEFT JOIN pg_namespace N ON (N.oid = C.relnamespace)
     WHERE nspname = 'public'
     AND C.relkind <> 'i'
     AND nspname !~ '^pg_toast'
ORDER BY pg_total_relation_size(C.oid) DESC

Full Schema

SELECT * FROM information_schema.columns WHERE table_schema = 'public'

Setting Up a Local Blockchain with Ganache

Blockchain graphic

Why would I want to do this?

Interacting with blockchains and blockchain technology probably seems like a very complex task to most people. How do you even get started? Don’t they run on servers spread across the globe? How would I make a transaction and see the result? Wouldn’t I need to use real money to do this?

If you want to play around with blockchain technology but don’t know how to get started, a great way is to run a local test blockchain on your own computer. It’s easy to set up, carries no risk of losing your own money, gives you immediate insight into what’s happening and can be reset at any moment so you can try over and over again.

We’ll use two applications to get started:

  • Ganache
  • MetaMask

From start to finish it should take no longer than 30 minutes ⌛

Ganache – A Personal Blockchain

The absolute easiest way to get started is by using Ganache. Ganache is a personal Ethereum blockchain running on your own computer that’s incredibly easy to install and get running. It’s basically a virtualized blockchain application.

It’s available for Windows, Mac and Linux; just download the installer, double-click to install and run. It takes 5 minutes to get started.

MetaMask

Once Ganache is installed you need a way to interact with the blockchain. There are many applications available to do this but the easiest is probably MetaMask. It’s a browser extension that supports Chrome, Brave, Firefox and Opera plus has iOS and Android apps in beta. Follow the directions on the site to install and create an account.

We will use MetaMask to connect to our local blockchain server so we can add accounts and send test transactions between the accounts. These transactions we will then be able to see in Ganache.

To install MetaMask get it from the Chrome Web Store and follow the instructions to create an account.

Connect MetaMask to Ganache

Assuming you have now installed Ganache and MetaMask, we need to connect the applications. First run Ganache and select the Quickstart option. This uses the default settings and gets us up and running.

Starting Ganache
Ganache starting view

Ganache will now create a test blockchain and some test accounts, each of which by default holds 100 ETH (test ETH of course). You can see the accounts below along with their public addresses, balances and transaction counts. That’s all there is to getting the test blockchain 🤜

Ganache application
Ganache accounts view

Now that Ganache is running we need to connect it to MetaMask. Open MetaMask and log into your account.

MetaMask login

To make working in MetaMask easier you can click on the more menu and choose ‘Expand View‘ to open it full screen.

Expand to full screen

To connect MetaMask to our local blockchain we need to change a few settings in MetaMask. First click on the network name at the top and select ‘Custom RPC’.

Change MetaMask network

Here we add the details for our local blockchain. If you look in the header of Ganache you can see the server details we will use.

Ganache RPC server settings

Call the network anything you want; the URL must be http://127.0.0.1:7545, since Ganache is running on port 7545 on localhost, and leave the rest blank.

Once you click Save you are connected to the Ganache blockchain, although right now there’s not much to see. To really see what’s going on we need to add accounts to MetaMask.

Adding Accounts

Returning to Ganache choose one of the accounts to add and click on the key symbol. This will allow us to see the private key of the account. Obviously this is not something you would normally be able to do since private keys are, by their nature, private.

Show the private key

Copy the private key from the next screen.

Copied private key

Returning to MetaMask, click on the circle logo and select ‘Import Account’.

Import accounts

Make sure the type is Private Key and then paste in the private key you copied from Ganache.

You’ll see the account is imported with a balance of 100 ETH which matches what we saw in Ganache. You can edit your account name by clicking on the name and changing it in the dialog that opens.

Imported Account

Creating Transactions

Now we’re finally ready to start interacting with the blockchain and creating transactions.

First in Ganache choose an account you wish to send ETH to and copy the address.

Recipient Address

Back in MetaMask click the Send button as shown here.

Sending test ETH

In the ‘Add Recipient’ field paste the account address you just copied from Ganache, choose the amount of ETH you wish to send and pick a Transaction Fee. Because this is a test network the fee is irrelevant as the blocks are mined automatically but normally this fee controls how the transaction will be prioritized by miners.

Send options

You’ll now see the transaction in MetaMask, along with the remaining balance (100 – 5 – transaction fee).

Transaction in MetaMask

The same can be seen in Ganache, so the sending account is debited with 5 ETH and the receiving account credited with 5 ETH. The transaction fee has been used by the blockchain to create the block containing the transaction.

Ganache account balances
New balances in Ganache

Selecting the transactions tab in Ganache gives you a view of all the transactions made so far.

Ganache transactions
Blockchain Transactions

Selecting the Blocks tab in Ganache gives a view of the blocks mined. So far you can see one block was automatically created when we started Ganache. This is known as the Genesis block and forms the head or root of the blockchain. Our transaction created a new block linked to this initial block. Every block created after the initial block is mathematically linked back to the previous block, and so on all the way back to the Genesis block (block 0).

Generated Blocks
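If you’d rather drive the same Ganache instance from code instead of MetaMask, a minimal web3.py sketch looks something like this. The RPC URL is Ganache’s default Quickstart address, the account indexes are just the first two test accounts it created, and the snake_case method names are from recent web3.py versions (older releases use camelCase equivalents):

from web3 import Web3

# Connect to the local Ganache RPC server (default Quickstart settings)
w3 = Web3(Web3.HTTPProvider("http://127.0.0.1:7545"))
print(w3.is_connected())

# Ganache's test accounts are unlocked, so we can send directly between them
sender, recipient = w3.eth.accounts[0], w3.eth.accounts[1]
tx_hash = w3.eth.send_transaction({
    "from": sender,
    "to": recipient,
    "value": w3.to_wei(5, "ether"),
})
print(w3.eth.get_balance(recipient))  # balance in wei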