DIGOO DG-HOSA – Part 2 Firmware Extraction and Initial Analysis

This is a continuation from a previous post: https://ben.the-collective.net/2019/08/21/digoo-dg-hosa-part-1-teardown-and-hardware/

Finding the connections

Now that I have the lay of the land for the device (which I outlined in the previous part of this series), the first thing I looked for was the debugging connections for the main GigaDevice processor. This looks to be the primary processor for the device and holds the most valuable firmware. Since the board was well labeled, I didn't need to use tools like a JTAGulator or an Arduino board with the JTAGenum firmware to identify which test points make up the debug interface. I was able to find the SWDIO, SWCLK, +3.3V and GND connections for the Serial Wire Debug (SWD) interface. This is the same interface that STM32 chips use, and it provides functionality similar to a "standard" JTAG interface.

Serial Wire Debug (SWD) is a 2-pin (SWDIO/SWCLK) electrical alternative JTAG interface that has the same JTAG protocol on top. SWD uses an ARM CPU standard bi-directional wire protocol, defined in the ARM Debug Interface v5. This enables the debugger to become another AMBA bus master for access to system memory and peripheral or debug registers.

https://www.silabs.com/community/mcu/32-bit/knowledge-base.entry.html/2014/10/21/serial_wire_debugs-qKCT

In the image below you can see the debug test points along with the wires I soldered to them to connect to my debugger. Given the proximity of these test points to the GD32F105 processor, it is a good assumption that they are for that chip.

As a bonus also pictured is my wire soldered around the switch on the upper left to bypass the intrusion detection function.

For this project, I soldered wires to most of the test points across the board. This board has a ton of test points that may be useful for monitoring signals over the course of this project. To manage the wiring for all of these test points I created a test jig to keep the setup organized. The next picture shows my test setup.

The firmware extraction setup

This jig was inspired by some tweets from long ago by cybergibbons, where he recommended doing something similar. Once all of the test wires were in place, I hooked up my ARM debugger of choice, the Black Magic Probe (BMP) from 1BitSquared, and started the process of extracting the firmware.

Initially, I tried to power the board from the BMP, but I found that it could not provide enough power to run even the minimum number of peripherals. The BMP can only supply 100 mA. Some lights would come on, but gdb would not detect any connected devices. I ended up adding the USB connection you see in the photo to provide more power to the board.

Now that everything was powered and connected, I was able to use gdb to attach to the board and dump the firmware of the device.

Extracting the firmware: gdb

The first step is to attach my local ARM gdb build to the Black Magic Probe, which acts as a remote gdb server. I always find the Useful GDB commands page in the BMP wiki very helpful for refreshing my memory. The syntax and terminal output I started with are:

╭─locutus@theborgcube ~/Projects/RE-Digoo_DG-HOSA
╰─$ arm-none-eabi-gdb -ex "target extended-remote /dev/tty.usbmodemC2D9BBC31"
 GNU gdb (GNU Tools for ARM Embedded Processors) 7.10.1.20160616-cvs
 Copyright (C) 2015 Free Software Foundation, Inc.
 License GPLv3+: GNU GPL version 3 or later http://gnu.org/licenses/gpl.html
 This is free software: you are free to change and redistribute it.
 There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
 and "show warranty" for details.
 This GDB was configured as "--host=x86_64-apple-darwin10 --target=arm-none-eabi".
 Type "show configuration" for configuration details.
 For bug reporting instructions, please see:
 http://www.gnu.org/software/gdb/bugs/.
 Find the GDB manual and other documentation resources online at:
 http://www.gnu.org/software/gdb/documentation/.
 For help, type "help".
 Type "apropos word" to search for commands related to "word".
 /Users/locutus/.gdbinit:1: Error in sourced command file:
 No symbol table is loaded.  Use the "file" command.
 Remote debugging using /dev/tty.usbmodemC2D9BBC31
 (gdb) monitor
 Black Magic Probe (Firmware v1.6.1-1-g74af1f5) (Hardware Version 3)
 Copyright (C) 2015  Black Sphere Technologies Ltd.
 License GPLv3+: GNU GPL version 3 or later http://gnu.org/licenses/gpl.html
 (gdb) monitor swdp_scan
 Target voltage: 3.3V
 Available Targets:
 No. Att Driver
  1      STM32F1 high density
 (gdb) attach 1
 Attaching to Remote target
 0x08007b46 in ?? ()
 (gdb) dump binary memory firmware.bin 0x08000000 0x080FFFFF
 Cannot access memory at address 0x8080000

When I ran into the error at the end of the terminal output I was a bit confused, until I looked at the memory layout of the chip in the datasheet and saw that I was reading past the end of the first flash memory bank.

datasheet

After I adjusted the GDB dump command...

(gdb) dump binary memory firmware.bin 0x08000000 0x0807FFFF
(gdb)

...success!

╭─locutus@theborgcube ~/Projects/RE-Digoo_DG-HOSA
╰─$ ls -l firmware.bin
 -rw-r--r--  1 locutus  staff  524287 Nov 16 14:13 firmware.bin
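The 524287-byte file size is worth a second look, since it is one byte short of 512 KiB. As far as I can tell, gdb treats the second address of dump binary memory as an exclusive stop address, which would explain the size. Here is a quick sketch of the arithmetic in Python; the helper name dump_length is mine and not part of gdb, and the 512 KiB figure is simply what the failing address 0x08080000 implies.

```python
FLASH_BASE = 0x08000000   # start address used in the dump command above
FLASH_SIZE = 512 * 1024   # 0x80000, implied by the failing address 0x08080000


def dump_length(start: int, stop: int) -> int:
    """Bytes written by `dump binary memory FILE start stop`, assuming stop is exclusive."""
    return stop - start


print(dump_length(FLASH_BASE, 0x0807FFFF))               # 524287, matches the ls output
print(dump_length(FLASH_BASE, FLASH_BASE + FLASH_SIZE))  # 524288, the full 512 KiB
print(hex(FLASH_BASE + FLASH_SIZE))                      # 0x8080000
```

If that reading is right, using 0x08080000 as the stop address should capture the final byte as well without touching the inaccessible region.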

Now that I have a copy of the firmware, I can do some initial analysis of it.

Initial Analysis

First things first: as with any binary, I start by running strings to get some hints about the contents and to make sure it is a valid dump. I found a ton of strings showing this is a valid dump of the firmware, most notably the same markings that are on the board showing up in the firmware:

PCB:PG-103 VER2.3/FIRMWARE: 103-2G-J

Other strings indicate that they are using the Micrium µC/OS-III Real-Time Operating System (RTOS) as the operating system. The Micrium site does not specifically list the GigaDevice chip as supported, just the general ARM Cortex-M3 cores.
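For anyone without the strings utility handy, the same idea can be sketched in a few lines of Python. This is only a rough stand-in for strings, not how I actually ran it: the function name ascii_strings and the minimum length of 6 are my own choices, and the sample bytes are made up around the board marking quoted above.

```python
import re


def ascii_strings(data: bytes, min_len: int = 6):
    """Yield (offset, text) for each run of printable ASCII at least min_len bytes long."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    for match in pattern.finditer(data):
        yield match.start(), match.group().decode("ascii")


# Made-up sample: two junk bytes, one long string, one run that is too short.
sample = b"\x00\x01PCB:PG-103 VER2.3\x00ab\x00"
for offset, text in ascii_strings(sample):
    print(f"0x{offset:06x}  {text}")
```

To scan the real dump, read firmware.bin in binary mode and pass its contents to ascii_strings in place of the sample.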

Seeing this let me know that reversing this firmware will be much more complex than I had hoped. The RTOS will add a lot of scheduling and other miscellaneous functions to look into. After this initial investigation, it was time to load the firmware into Radare. I used the following command when loading it up:

r2 -a arm -b 16 -m 0x0800c000 firmware.bin

This syntax sets the processor architecture (-a), the bit width (-b, where 16 selects Thumb instruction decoding on ARM), and the starting memory location (-m). Once loaded, I run an initial analysis job to see what Radare finds.

[0x0800c000]> aaa
 [x] Analyze all flags starting with sym. and entry0 (aa)
 [x] Analyze function calls (aac)
 [x] find and analyze function preludes (aap)
 [x] Analyze len bytes of instructions for references (aar)
 [x] Check for objc references
 [x] Check for vtables
 [x] Finding xrefs in noncode section with anal.in=io.maps
 [x] Analyze value pointers (aav)
 [x] Value from 0x0800c000 to 0x0808bfff (aav)
 [x] 0x0800c000-0x0808bfff in 0x800c000-0x808bfff (aav)
 [x] Emulate code to find computed references (aae)
 [x] Type matching analysis for all functions (aaft)
 [x] Use -AA or aaaa to perform additional experimental analysis.

[0x0800c000]> afl |wc -l
      844

Radare found 844 functions without any hints or adjustments. From the work I have done since, I know there are even more than 844 functions. Now that I have a copy of the firmware, I have dived in and started analyzing it, which as of writing is still a work in progress. As I get further along I will cover some of the techniques I am using to take apart this firmware.
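One cheap sanity check that can be run on a Cortex-M dump before trusting a load address is to read the start of the vector table: the first 32-bit little-endian word is the initial stack pointer and the second is the reset handler address, which should have bit 0 set for Thumb. The sketch below assumes the vector table sits at offset 0 of the image, which I have not confirmed for this firmware. The function names and the sample header values are hypothetical.

```python
import struct


def read_vector_table(image: bytes):
    """Return (initial_sp, reset_handler) from the first 8 bytes of a Cortex-M image."""
    initial_sp, reset = struct.unpack_from("<II", image, 0)
    return initial_sp, reset


def looks_plausible(initial_sp: int, reset: int,
                    flash_base: int = 0x08000000, flash_size: int = 0x80000,
                    sram_base: int = 0x20000000) -> bool:
    """Heuristic: SP points into the SRAM region, reset is a Thumb address inside flash."""
    in_flash = flash_base <= (reset & ~1) < flash_base + flash_size
    is_thumb = bool(reset & 1)
    sp_in_sram = (initial_sp >> 28) == (sram_base >> 28)
    return in_flash and is_thumb and sp_in_sram


# Made-up header for illustration only, not taken from the real firmware.
header = struct.pack("<II", 0x20010000, 0x08000145)
sp, reset = read_vector_table(header)
print(hex(sp), hex(reset), looks_plausible(sp, reset))
```

If the values read from the real dump fail this check, that would be a hint that the image starts with something other than a vector table, or that the base address needs adjusting.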

Enabling old TLS / SSL ciphers in OpenSSL

I was reminded of this tip during the CTF at a recent DC207 meetup. This config change is needed on machines with modern versions of OpenSSL that have disabled the older ciphers. The older SSL and TLS versions and their associated cipher suites have become insecure, so support for them has been disabled by default in OpenSSL.

As a workaround, you can add or edit the following lines at the bottom of /etc/ssl/openssl.cnf:

[system_default_sect]
 MinProtocol = TLSv1
 CipherString = DEFAULT@SECLEVEL=1

It may be required to comment out similar lines in the config if they already exist.
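If you would rather not change the system-wide config, the same relaxation can usually be applied per program. As a rough sketch, this is how the two settings above map onto Python's ssl module. The function name make_legacy_context is my own, and whether TLS 1.0 is actually available still depends on how the underlying OpenSSL was built.

```python
import ssl
import warnings


def make_legacy_context() -> ssl.SSLContext:
    """Client context that allows TLS 1.0+ and OpenSSL security level 1 ciphers."""
    ctx = ssl.create_default_context()
    with warnings.catch_warnings():
        # Newer Python releases warn that TLS 1.0 and 1.1 are deprecated.
        warnings.simplefilter("ignore", DeprecationWarning)
        ctx.minimum_version = ssl.TLSVersion.TLSv1   # same idea as MinProtocol = TLSv1
    ctx.set_ciphers("DEFAULT@SECLEVEL=1")            # same idea as the CipherString line
    return ctx


ctx = make_legacy_context()
print(ctx.minimum_version)
```

Only use a context like this for lab or CTF targets that really need it, since it weakens the defaults for every connection made through it.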

My OSCP Experience

What is the OSCP

Offensive Security Certified Professional (OSCP) is an entry-level hands-on penetration testing certification. The OSCP is one of a few certifications by Offensive Security. It consists of the self-study Penetration Testing Training with Kali Linux (PwK) class and an online proctored practical exam.

The course costs at minimum $800 USD and includes 30 days of lab access and one OSCP exam attempt. There are packages that include longer lab access and you can extend your lab access if you find you need longer to prepare.

What ISN’T the OSCP

  • Current methods and techniques
  • It won’t make you a l33t hax0r, but you will learn fundamentals

How long did you study?

I started working on it in September 2018, then life and the holidays got in the way of dedicated study time. I kept slowly and intermittently practicing until April 2019, when I REALLY started to get serious about completing the OSCP. This started crunch time. I am lucky that my partner was on board with me locking myself away to focus on labbing. I took the exam on May 9th, 2019.

What did you do to study?

I started by going through Offensive Security’s Penetration Testing with Kali Linux (PwK) workbook and then watching the associated videos. They are both fantastic resources providing the solid base of knowledge you need for the exam. I had the PwK workbook printed out and bound to save my eyes from staring at a screen. Through all my studies, I took a lot of notes. I used these notes when working on machines in the lab, the exam, and other CTF-style boxes I worked on. Below are copies of the notes I created while studying.

Once I completed the workbook and videos, it was time to sit down and start working on machines in the Lab. While working on the labs I began to branch out and learn from various sources across the internet. As I worked through the lab and got closer to my exam date, I started to focus on my weak topics, which were Windows Exploitation and Windows Privilege Escalation. I have added some of the main links and books I used to study; there are many more links in my notes.

Links

Books

  • Penetration Testing: A Hands-On Introduction to Hacking - Georgia Weidman
  • The Hacker Playbook: Practical Guide To Penetration Testing - Peter Kim
  • The Hacker Playbook 2: Practical Guide To Penetration Testing - Peter Kim
  • Hacking: The Art of Exploitation - Jon Erickson

OMG the Exam…

The OSCP exam is a practical test: 24 hours of hacking in a mock environment, attempting to break into various targets. You will then have another 24 hours to write a report based on your findings from the exam. To obtain your OSCP you must submit a report; I'll talk more about the report later. The Exam is proctored: you will run software that captures your screen and webcam, both of which are monitored by one or more proctors. There are limits on the tools you can use during the Exam:

Spoofing (IP, ARP, DNS, NBNS, etc)
Commercial tools or services (Metasploit Pro, Burp Pro, etc.)
Automatic exploitation tools (e.g. db_autopwn, browser_autopwn, SQLmap, SQLninja etc.)
Mass vulnerability scanners (e.g. Nessus, NeXpose, OpenVAS, Canvas, Core Impact, SAINT, etc.)
Features in other tools that utilize either forbidden or restricted exam limitations
You are limited to using Metasploit once during the exam

https://support.offensive-security.com/oscp-exam-guide/

These limitations are an example of why it is important to fully read through the exam guide and reporting template to make sure you have all the proofs and meet the reporting requirements. These guides are found at the following links:

Lab and Exam Reporting Info: https://support.offensive-security.com/pwk-reporting/
OSCP Exam Guide: https://support.offensive-security.com/oscp-exam-guide/
Proctoring FAQ: https://support.offensive-security.com/proctoring-faq/

My exam agenda

When planning for my Exam I created a high-level schedule to follow. This was an important way for me to get organized. My exam started at 9:00 am, allowing me to follow a routine similar to my normal one.

  • Wake up … Breakfast
  • Connect to Proctor and follow process - 15 mins before start
  • Receive access details and connect to VPN - 15 mins
  • Read requirements and write down in notes - 30 - 45 mins
  • Initial Enumeration of targets - 1 hour
  • Hack Away!
  • Eat Lunch
  • Hack…
  • Eat Dinner
  • Probably still Hack…..

Exam Tips and Tactics

This is a list of various mostly non-technical tips I have for when taking the Exam. When reading through people's challenges on Reddit, Twitter and Blog posts I saw a lot of people ran into less than technical issues when taking their Exams.

  • I'll repeat this here: make sure you read through the exam guide and reporting template to make sure you have all the proofs and meet the reporting requirements!
  • Attempt to limit distractions and find ways to go into flow
  • Manage your time wisely
    • I used Pomodoro to help divide up my day. This method is ~25 minutes of working, then a 5-minute break, repeat. I changed targets on each cycle if I was not making progress and was just grinding away on a machine. This method helped me avoid getting stuck on one machine for extended periods of time.
  • Keep a timeline of the day
    • This will help you reference any screenshots or recordings you created later.
  • You are your own worst enemy: Avoid going down a rabbit hole
    • Breathe…go for a walk…pet a cat…Have a snack...
  • Enumerate Enumerate Enumerate
    • If you are not finding your way into a system or the way to escalate privilege, enumerate more.
  • Screenshot, Screen record, track everything! This will take the stress off of creating the report the next day.

Reporting

There are two topics when it comes to reporting: the Lab report and the Exam report. Offensive Security provides a guide for reporting at the following URL: https://support.offensive-security.com/pwk-reporting/. This contains some templates and some recommendations on how to manage data.

One of the first questions people ask is whether I did the Lab report. I decided not to do the Lab report; it is only worth 5 points, and I did not find that the time to create the report was worth it for me. However, I did write a mock report to practice ahead of the Exam. This made sure that my first Exam reporting experience was not during the Exam, when I would be exhausted.

When it came to my Exam report, I started it after I had finished my Exam but had not yet closed out with my proctor, creating a very, very, very rough document with the screenshots and other content. I did this to make sure I had satisfied all of the requirements, and so that I could go back and recreate or regather any Proofs I may have missed. After I thought I had everything and the adrenaline had started to wear off, I went to sleep, then finished the document over the course of the next day.

In Closing

  • The OSCP was a great experience and very challenging
  • There is a lot to learn
  • Make sure significant people in your life understand the time commitment
  • ABL, Always Be Labbing
  • Have fun, good luck, and #tryharder

Link: Exploring Key Features of Cisco ISE Release 2.6

In July I wrote for the CDW blog about the new version of the Cisco Identity Services Engine (ISE) software.

Exploring Key Features of Cisco ISE Release 2.6

The latest version of this cybersecurity tool offers unique device identification and an IoT protocol.

DIGOO DG-HOSA – Part 1 (Teardown and Hardware)

Banggood page

This project started with the idea of purchasing a cheap security system off one of the Chinese stores. After a little hunting, I found the Digoo DG HOSA 433MHz 2G&GSM&WIFI Smart Home Security Alarm System Protective Shell Alert with APP, which looked interesting, so I picked one up to tear apart. I was curious about how the various communication methods were implemented.

This is the first part of this adventure; the next part will explore the firmware of the device. With that, let's take a look at the hardware.

Teardown Time

After the device showed up, I quickly got down to taking it apart. In my haste, I didn't take many good photos of it intact. The front side of the board is straightforward; it contains the screen, the button array for all user input, and a lot of useful test points. The front side is pictured below.

Board Front

The most significant information found on the front side of the board is the notation PG-103, which is also found in the firmware (spoiler). After some searching, I found this device is also branded as the PGST PG-103. This kind of rebranding of hardware is not unusual for a lot of Chinese devices.

Now switching to the back of the board, which is the business side, with the main chips and modules providing the various communication methods. When opening the device I encountered the intrusion detection button. This button causes the device to go into an alarm mode and requires a reset of the device to come back online. For my testing, I bypassed this button by bridging both sides of it.

Board Back

Component List

When inspecting the board, I found a few significant components and modules on the board. I was not surprised to see that most of the major communication parts are off the shelf modules. The components listed below are highlighted in the image above and the relevant data sheets where available are linked.

The main processor is a GigaDevice GD32 chip, part of a series that is very similar to the STMicroelectronics STM32 chips. The GD32F105 chip uses an ARM-based instruction set and has the same pinout as the STM32F105 component.

Block Diagram

The high-level block diagram for the device is pretty straightforward. The GD32F105 chip handles the primary processing and controls the external communication modules. This allows for a modular architecture across all of the peripherals.

 +-----------------------+
 |  Cellular             +-----------+
 |  Quectel M26          |           |
 +-----------------------+           |
 +-----------------------+  +--------+-------+
 |  WIFI                 +--+   CPU          |
 |  HF-LPB120-1          |  |   GD32F105RCT6 |
 +-----------------------+  +--------+-+-----+
 +-----------------------+           | |
 |  433MHz receiver      |           | |
 |  SYN511R              +-----------+ |
 +-----------------------+             |
 +-----------------------+             |
 |  Keypad Controller    +-------------+
 |  Holtek BS83B16A-3    |
 +-----------------------+

Pin Out

There are many test points on the board, and by tracing them out I was able to work out where most of the pins connect on the controller.

  • SYN515R Pin 10 (DO) -> CPU PB9 (62)
  • Unknown -> CPU PA5
  • Unknown -> CPU PA6
  • Unknown -> CPU PA8
  • U7 SCL -> Unknown
  • U7 SDA -> Unknown
  • DAC_OUT -> CPU PA4 (20)
  • WIFI UART TX -> CPU PA2 (16)
  • WIFI UART RX -> CPU PA3 (17)
  • GSM UART TX -> CPU PA12 (45)
  • GSM UART RX -> CPU PA13 (46)
  • U1 (F117) Pin 6 -> CPU PB8

Summary?

After investigating the hardware I was able to extract the firmware and start the reversing process. I will cover what I have found in future posts. For now, if you are interested in higher resolution photos of the board, I have posted them on my Flickr account.

BSidesNH 2019 Recap

Badge

Back on May 18th, I attended the inaugural BSidesNH event. It was a fantastic one-day event. The day started pretty early for me, driving down from Maine to Southern NH University. I arrived to pick up the fantastic badge made out of an old 3.5" disk. After grabbing some coffee and a snack I settled into the auditorium for a day of great talks. There were a few that stood out to me that I will talk about.

The second talk of the day was Ghost in the Shell: When AppSec Goes Wrong by Tony Martin. Tony first covered some basics of web application security. He framed these issues around the research he has done into various NAS devices and the vulnerabilities he has discovered, including the ability to create shadow users that have administrative access to devices but are not visible through the administrative interfaces of the device.

After lunch was Chinese and Russian Hacking Communities, presented by Winnona DeSombre and Dan Byrnes, Intelligence Analysts from Recorded Future. They covered the operations and cultures of Chinese and Russian underground groups. This was a very entertaining presentation and a summary of the information contained in the report: Thieves and Geeks: Russian and Chinese Hacking Communities.

The second to last talk of the day was Hunting for Lateral Movement: Offense, Defense, and Corgis, presented by Ryan Nolette. He covered the ways attackers move around and infiltrate further into a network... and Corgis. A great quote that stuck with me from his talk was: “If you teach an analyst how to think they will punch above their weight.” I feel this quote applies not only to security analysts but to all levels of IT professionals.

BSidesNH was a well-run and enjoyable event and a great addition to the security events in New England. Thanks to all of the organizers and sponsors. I look forward to attending next year!

Hashcat in AWS EC2

Intro

During my OSCP studies, I realized I needed a more efficient system for cracking password hashes. The screaming CPU fans and high CPU usage became a problem. I first tried using hashcat with the GPU on my MacBook Pro in OS X, but there are some bugs and problems with hashcat on OS X that would make it crash in the middle of cracking a hash. Also, I was not interested in investing in a server with a bunch of GPUs; the high cost would outweigh the amount of time I need the system. All of this led me to do a little research, and I found the instructions in the following link for building an AWS instance for password cracking.

https://medium.com/@iraklis/running-hashcat-v4-0-0-in-amazons-aws-new-p3-16xlarge-instance-e8fab4541e9b

Since that post was created there have been some changes to the offerings in AWS EC2, leading me to write this post.

If you wish to skip ahead, I have created scripts to automate the processes in the rest of this post. They are both in my GitHub and can be downloaded at the following links.

https://github.com/suidroot/AWSScripts/blob/master/aws-ec2-create-kracker.sh
https://github.com/suidroot/AWSScripts/blob/master/configure-kracker.sh

For the rest of the article I will cover some of the instance options in EC2, installation of the needed Linux packages, the basic setup of Hashcat, running Hashcat, and finally monitoring and benchmarks of an EC2 instance.

AWS EC2 Options

There are many options for EC2 instances, with a huge range in cost and scale.

I found the g3 instances to be the more cost-effective tier. For my testing I opted to use the g3.4xlarge instance type. The next step is to choose the AMI image with the appropriate operating system.

AMI images

There are two AMI options that I tested hashcat on, and both are Ubuntu based. I’m sure there are many other available options that will work too, but I am familiar with Ubuntu systems. The first option is a standard Ubuntu image. There is nothing special about this image, and it requires a little more work to configure, including adding the GPU drivers.

Standard Ubuntu

The next option is a Deep Learning image. This image is preconfigured with the GPU drivers and was originally designed for machine learning applications. I found that the pre-configuration allowed me to skip a few steps in building out a new system.

Deep learning Ubuntu GPU driver preloaded

Instance Build and config

Once you have the instance deployed, there are a few steps to get it prepared for hashcat. The steps are a little bit different between a Standard and a Deep Learning Ubuntu instance.

An apt cronjob may already be running and you will have to wait it out.

Prepare Machine (Standard Ubuntu)

This script will install all the required packages and the Nvidia GPU drivers on a vanilla Ubuntu installation.

#!/bin/bash

# mostly copied from: https://medium.com/@iraklis/running-hashcat-v4-0-0-in-amazons-aws-new-p3-16xlarge-instance-e8fab4541e9b
#
sudo apt-get update -yq
sudo apt-get install -yq build-essential linux-headers-$(uname -r) unzip p7zip-full linux-image-extra-virtual
sudo apt-get install -yq ocl-icd-libopencl1 opencl-headers clinfo
#sudo apt-get install -yq libhwloc-plugins libhwloc5 libltdl7 libpciaccess0 libpocl2 libpocl2-common ocl-icd-libopencl1 pocl-opencl-icd
sudo apt-get install -yq python3-pip 
pip3 install psutil

sudo touch /etc/modprobe.d/blacklist-nouveau.conf
sudo bash -c "echo 'blacklist nouveau' >> /etc/modprobe.d/blacklist-nouveau.conf"
sudo bash -c "echo 'blacklist lbm-nouveau' >> /etc/modprobe.d/blacklist-nouveau.conf"
sudo bash -c "echo 'options nouveau modeset=0' >> /etc/modprobe.d/blacklist-nouveau.conf"
sudo bash -c "echo 'alias nouveau off' >> /etc/modprobe.d/blacklist-nouveau.conf"
sudo bash -c "echo 'alias lbm-nouveau off' >> /etc/modprobe.d/blacklist-nouveau.conf"

sudo touch /etc/modprobe.d/nouveau-kms.conf
sudo bash -c "echo 'options nouveau modeset=0' >>  /etc/modprobe.d/nouveau-kms.conf"
sudo update-initramfs -u
sudo reboot

### Install nVidia Drivers
### The reboot above ends this script, so run the commands below after the instance is back up.
wget http://us.download.nvidia.com/tesla/410.104/NVIDIA-Linux-x86_64-410.104.run
sudo /bin/bash NVIDIA-Linux-x86_64-410.104.run --ui=none --no-questions --silent -X

Prepare Machine (Deep Learning Ubuntu)

In comparison to the previous script, the script to prepare the Deep Learning instance is much simpler. The main focus is installing the needed archive extraction tools.

#!/bin/bash

sudo apt update
# -y answers the confirmation prompts so the script can run unattended
sudo apt upgrade -y
sudo apt install -y clinfo unzip p7zip-full
sudo apt install -y build-essential linux-headers-$(uname -r) # Optional
sudo apt-get install -yq python3-pip 
pip3 install psutil

Hashcat Setup

Now we need to download and extract the star of the show, Hashcat. The link in the wget command below points to the most recent version as of writing; however, you might want to check whether there is a more recent version at the main site: https://hashcat.net/hashcat/

wget https://hashcat.net/files/hashcat-5.1.0.7z
7z x hashcat-5.1.0.7z

Download wordlists

You will need some wordlists for hashcat to use to crack passwords. The commands listed are for some wordlists I like to use when cracking. You should, however, add whichever lists are your favorites.

mkdir ~/wordlists
git clone https://github.com/danielmiessler/SecLists.git ~/wordlists/seclists
wget -nH http://downloads.skullsecurity.org/passwords/rockyou.txt.bz2 -O ~/wordlists/rockyou.txt.bz2
cd ~/wordlists
bunzip2 ./rockyou.txt.bz2
cd ~

Running hashcat

Now it is time to run hashcat and crack some passwords. When running hashcat I had the best performance with the arguments -O -w 3. Below is an example command line I've used, including a rules file.

./hashcat-5.1.0/hashcat64.bin --username -m 1800 ./megashadow256.txt wordlists/rockyou.txt -r hashcat-5.1.0/rules/best64.rule -O -w 3

Monitoring the Nvidia GPU

The nvidia-smi utility can be used to show the GPU processor usage and which processes are utilizing the GPU(s). The first example shows an idle GPU.

ubuntu@ip-172-31-17-6:~$ sudo nvidia-smi
Fri Apr 26 14:43:49 2019
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 410.104      Driver Version: 410.104      CUDA Version: 10.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla M60           Off  | 00000000:00:1E.0 Off |                    0 |
| N/A   37C    P0    42W / 150W |      0MiB /  7618MiB |     97%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

This example shows a GPU being used by hashcat.

ubuntu@ip-172-31-17-6:~$ sudo nvidia-smi
Fri Apr 26 14:44:44 2019
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 410.104      Driver Version: 410.104      CUDA Version: 10.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla M60           Off  | 00000000:00:1E.0 Off |                    0 |
| N/A   46C    P0   141W / 150W |    828MiB /  7618MiB |    100%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0     11739      C   ./hashcat-5.1.0/hashcat64.bin                817MiB |
+-----------------------------------------------------------------------------+
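If you want to log these numbers during a long cracking run rather than eyeball the table, nvidia-smi also has a CSV query mode. The sketch below is one way to wrap it from Python. The function names and the choice of fields are my own, and the sample line is built by hand from the busy-GPU output above rather than captured from the tool.

```python
import subprocess

QUERY_FIELDS = ["utilization.gpu", "memory.used", "memory.total", "temperature.gpu"]


def parse_gpu_line(line: str) -> dict:
    """Turn one CSV line from nvidia-smi --query-gpu into a dict of integers."""
    values = [int(part.strip()) for part in line.split(",")]
    return dict(zip(QUERY_FIELDS, values))


def query_gpus() -> list:
    """Run nvidia-smi once and return one dict per GPU (needs the Nvidia driver)."""
    cmd = ["nvidia-smi",
           "--query-gpu=" + ",".join(QUERY_FIELDS),
           "--format=csv,noheader,nounits"]
    output = subprocess.run(cmd, capture_output=True, text=True, check=True).stdout
    return [parse_gpu_line(line) for line in output.splitlines() if line.strip()]


# Hand-built sample: 100 % utilization, 828 MiB of 7618 MiB used, 46 C
print(parse_gpu_line("100, 828, 7618, 46"))
```

Calling query_gpus() in a loop with a sleep gives a simple monitor. Note that a field the card does not support comes back as text instead of a number, which this minimal parser does not handle.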

Conclusion and Benchmarks

Finally here is a benchmark I ran on a g3.4xlarge instance. This instance type contains 1 GPU. These results give an idea of performance for this AWS EC2 instance type.

ubuntu@ip-172-31-17-6:~$ ./hashcat-5.1.0/hashcat64.bin -O -w 3 -b
hashcat (v5.1.0) starting in benchmark mode...

* Device #2: Not a native Intel OpenCL runtime. Expect massive speed loss.
             You can use --force to override, but do not report related errors.
nvmlDeviceGetFanSpeed(): Not Supported

OpenCL Platform #1: NVIDIA Corporation
======================================
* Device #1: Tesla M60, 1904/7618 MB allocatable, 16MCU

OpenCL Platform #2: The pocl project
====================================
* Device #2: pthread-Intel(R) Xeon(R) CPU E5-2686 v4 @ 2.30GHz, skipped.

Benchmark relevant options:
===========================
* --optimized-kernel-enable
* --workload-profile=3

Hashmode: 0 - MD5

Speed.#1.........: 11611.6 MH/s (90.74ms) @ Accel:512 Loops:512 Thr:256 Vec:4

Hashmode: 100 - SHA1

Speed.#1.........:  4050.2 MH/s (65.01ms) @ Accel:512 Loops:128 Thr:256 Vec:2

Hashmode: 1400 - SHA2-256

Speed.#1.........:  1444.5 MH/s (91.98ms) @ Accel:256 Loops:128 Thr:256 Vec:1

Hashmode: 1700 - SHA2-512

Speed.#1.........:   499.4 MH/s (66.78ms) @ Accel:128 Loops:64 Thr:256 Vec:1

Hashmode: 2500 - WPA-EAPOL-PBKDF2 (Iterations: 4096)

Speed.#1.........:   189.8 kH/s (42.76ms) @ Accel:128 Loops:64 Thr:256 Vec:1

Hashmode: 1000 - NTLM

Speed.#1.........: 18678.1 MH/s (56.58ms) @ Accel:512 Loops:512 Thr:256 Vec:2

Hashmode: 3000 - LM

Speed.#1.........: 10529.6 MH/s (50.60ms) @ Accel:128 Loops:1024 Thr:256 Vec:1

Hashmode: 5500 - NetNTLMv1 / NetNTLMv1+ESS

Speed.#1.........: 10650.8 MH/s (49.60ms) @ Accel:512 Loops:256 Thr:256 Vec:1

Hashmode: 5600 - NetNTLMv2

Speed.#1.........:   829.3 MH/s (80.24ms) @ Accel:256 Loops:64 Thr:256 Vec:1

Hashmode: 1500 - descrypt, DES (Unix), Traditional DES

Speed.#1.........:   442.0 MH/s (37.81ms) @ Accel:4 Loops:1024 Thr:256 Vec:1

Hashmode: 500 - md5crypt, MD5 (Unix), Cisco-IOS $1$ (MD5) (Iterations: 1000)

Speed.#1.........:  4209.1 kH/s (51.39ms) @ Accel:1024 Loops:500 Thr:32 Vec:1

Hashmode: 3200 - bcrypt $2*$, Blowfish (Unix) (Iterations: 32)

Speed.#1.........:     7572 H/s (33.02ms) @ Accel:16 Loops:4 Thr:8 Vec:1

Hashmode: 1800 - sha512crypt $6$, SHA512 (Unix) (Iterations: 5000)

Speed.#1.........:    76958 H/s (83.99ms) @ Accel:512 Loops:128 Thr:32 Vec:1

Hashmode: 7500 - Kerberos 5 AS-REQ Pre-Auth etype 23

Speed.#1.........:   149.4 MH/s (56.00ms) @ Accel:128 Loops:64 Thr:64 Vec:1

Hashmode: 13100 - Kerberos 5 TGS-REP etype 23

Speed.#1.........:   152.1 MH/s (55.00ms) @ Accel:128 Loops:64 Thr:64 Vec:1

Hashmode: 15300 - DPAPI masterkey file v1 (Iterations: 23999)

Speed.#1.........:    32703 H/s (84.02ms) @ Accel:256 Loops:64 Thr:256 Vec:1

Hashmode: 15900 - DPAPI masterkey file v2 (Iterations: 7999)

Speed.#1.........:    21692 H/s (96.24ms) @ Accel:256 Loops:128 Thr:32 Vec:1

Hashmode: 7100 - macOS v10.8+ (PBKDF2-SHA512) (Iterations: 35000)

Speed.#1.........:     5940 H/s (40.09ms) @ Accel:64 Loops:32 Thr:256 Vec:1

Hashmode: 11600 - 7-Zip (Iterations: 524288)

Speed.#1.........:     4522 H/s (55.87ms) @ Accel:256 Loops:128 Thr:256 Vec:1

Hashmode: 12500 - RAR3-hp (Iterations: 262144)

Speed.#1.........:    18001 H/s (56.74ms) @ Accel:4 Loops:16384 Thr:256 Vec:1

Hashmode: 13000 - RAR5 (Iterations: 32767)

Speed.#1.........:    18135 H/s (55.93ms) @ Accel:128 Loops:64 Thr:256 Vec:1

Hashmode: 6211 - TrueCrypt PBKDF2-HMAC-RIPEMD160 + XTS 512 bit (Iterations: 2000)

Speed.#1.........:   121.7 kH/s (59.39ms) @ Accel:128 Loops:32 Thr:256 Vec:1

Hashmode: 13400 - KeePass 1 (AES/Twofish) and KeePass 2 (AES) (Iterations: 6000)

Speed.#1.........:    68380 H/s (158.89ms) @ Accel:512 Loops:256 Thr:32 Vec:1

Hashmode: 6800 - LastPass + LastPass sniffed (Iterations: 500)

Speed.#1.........:  1088.7 kH/s (48.51ms) @ Accel:128 Loops:62 Thr:256 Vec:1

Hashmode: 11300 - Bitcoin/Litecoin wallet.dat (Iterations: 199999)

Speed.#1.........:     2107 H/s (78.97ms) @ Accel:128 Loops:64 Thr:256 Vec:1

Started: Fri Apr 26 14:36:56 2019
Stopped: Fri Apr 26 14:42:03 2019
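To put these numbers to use, it helps to convert a Speed line into plain hashes per second and divide a candidate count by it. The following is a rough sketch only: the function names are mine, the 10 million candidate count is a made-up figure rather than the size of any particular wordlist, and the estimate ignores rules, startup time, and multiple salts.

```python
import re

UNIT_FACTORS = {"H/s": 1, "kH/s": 1e3, "MH/s": 1e6, "GH/s": 1e9}
SPEED_RE = re.compile(r"^Speed\.#\d+\.*:\s+([\d.]+)\s+(\S+)")


def speed_in_hashes_per_second(line: str) -> float:
    """Parse a hashcat benchmark 'Speed.#N' line into hashes per second."""
    match = SPEED_RE.match(line)
    if match is None:
        raise ValueError(f"not a speed line: {line!r}")
    value, unit = match.groups()
    return float(value) * UNIT_FACTORS[unit]


def seconds_to_exhaust(candidates: int, hashes_per_second: float) -> float:
    """Time to try every candidate once against a single hash."""
    return candidates / hashes_per_second


# sha512crypt line copied from the benchmark output above
line = "Speed.#1.........:    76958 H/s (83.99ms) @ Accel:512 Loops:128 Thr:32 Vec:1"
rate = speed_in_hashes_per_second(line)
print(rate, seconds_to_exhaust(10_000_000, rate))
```

The same helper makes the gap between fast and slow hashes obvious: swap in the NTLM line and the hypothetical 10 million candidates are exhausted in well under a second.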

If you've made it this far, congratulations and happy cracking!