Debugger support for Greengrass

NOTE: This content has only been tested on Greengrass V1 so far. It will be updated for Greengrass V2 soon.

Attaching a debugger to a process running on a Greengrass Core is something I have wanted to do since the first release of Greengrass V1 years ago. I've managed to pull it off in a few different ways, but here I'll share the most recent methods I've used for Java, NodeJS, and Python 3.

Java

TODO

NodeJS

TODO

Python

Python debugging is implemented with pydevd-pycharm.

For Python 3, I replaced /usr/bin/python3.7 with the script below. Here's a high-level overview of what it does:

  • Adds debug shim code to the Python function
  • Opens a server on localhost that waits for a connection from a client
  • When the client connects to the server (telnet, netcat, etc.), the server asks it for a port number
  • The client enters the port number that its debug server is running on
  • The server tries to connect Python to the client's debug server on that port on 127.0.0.1. This requires the client to be running on the host or to port forward the host's local port to its debug port.
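
The client side of this exchange can also be scripted instead of typed into telnet or netcat. The following is a minimal sketch, assuming the shim behaves as described above (it prompts, reads a short port string, replies, and closes the connection). The function name request_debug_callback and its parameters are hypothetical and are not part of the original script.

```python
import socket


def request_debug_callback(shim_port, debugger_port, host="127.0.0.1", timeout=5.0):
    """Connect to the shim, send the debugger port, and return everything the shim said.

    shim_port is the port the shim listens on (DEBUG_PORT in the script).
    debugger_port is the port the IDE's debug server listens on.
    """
    with socket.create_connection((host, shim_port), timeout=timeout) as conn:
        # The shim sends a prompt before it reads the port number
        prompt = conn.recv(1024)
        # The shim reads at most 5 bytes, so send only the digits
        conn.sendall(str(debugger_port).encode("utf-8"))
        # Read the shim's reply until it closes the connection
        chunks = []
        while True:
            data = conn.recv(1024)
            if not data:
                break
            chunks.append(data)
    return (prompt + b"".join(chunks)).decode("utf-8")
```

For example, request_debug_callback(8123, 5678) would ask a shim listening on port 8123 to call back to a debug server on port 5678. Both port numbers here are made up for illustration.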

The debug code was designed like this because the Python debugger I used runs as a server rather than a client. Instead of the debugger attaching to a process, as with gdb or the Java Debug Wire Protocol (JDWP), the process reaches out to the debugger.

In detail it does this:

  • Removes the directory that contains the script (/usr/bin when it is installed as /usr/bin/python3.7) from the process's PATH environment variable. This keeps the script from recursively invoking itself.
  • Checks to see if the DEBUG_PORT variable is set. If it isn't, it picks a random port between 8000 and 9999, which avoids the privileged ports.
  • Attempts to find the real Python executable. If it cannot find Python it reports an error and exits with error code 1.
  • Checks to see if the DEBUG variable is set to "true". If it is not, it simply runs Python normally.
  • If the DEBUG variable is set to "true" it does the following:
    • Pipes the debug shim code into a Python interpreter
    • Passes the Python interpreter the command-line options passed from Greengrass (usually just the script name)
    • Starts the interpreter
  • The debug shim code does the following:
    • Prints that the debug code has been loaded along with the port number that it is listening on
    • Binds to the port number on 127.0.0.1
      • NOTE: Remote debugging is achieved by using an SSH tunnel and port forwarding the debugger to the client. However, this can be skipped in a closed environment by binding to a public IP address. This must not be done in production!
    • Waits for a client to connect and provide a port number to call back to the debugger
    • Attempts to connect to the provided debug port on 127.0.0.1 with pydevd_pycharm.settrace
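
The launcher steps above (clean the PATH, choose a port, find the real interpreter) can be sketched in Python to make the logic easier to test in isolation. This is an illustration under assumptions, not the script's actual implementation: the function names are hypothetical, and the PATH cleanup here compares whole PATH entries, whereas the bash script below removes the directory with a sed substitution.

```python
import os
import random
import shutil


def strip_dir_from_path(path_value, directory):
    """Return path_value with every entry equal to directory removed."""
    target = os.path.normpath(directory)
    kept = [entry for entry in path_value.split(os.pathsep)
            if entry and os.path.normpath(entry) != target]
    return os.pathsep.join(kept)


def choose_debug_port(env):
    """Use DEBUG_PORT if it is set, otherwise pick a port in 8000..9999."""
    value = env.get("DEBUG_PORT")
    if value:
        return int(value)
    return random.randrange(8000, 10000)


def find_real_python(name, search_path):
    """Look up name on the cleaned PATH; None means the wrapper should give up."""
    return shutil.which(name, path=search_path)
```

A caller would clean os.environ["PATH"] first and then pass the result to find_real_python, so that the wrapper cannot find itself.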

Code:

#!/usr/bin/env /bin/bash
# This file belongs in /usr/bin/python3.7
# Created by GGP

set -x
PATH=$PATH:/bin:/usr/bin
# Get a reference to the basename application in case its PATH entry is removed below
BASENAME=$(which basename)

# Get the current directory so it can be removed from the path so we don't call this script recursively
CURRENT_DIRECTORY=$(dirname "$(readlink -f "$0")")
# Escape the current directory so it can be used as a regex
CURRENT_DIRECTORY=$(echo $CURRENT_DIRECTORY | sed -e 's/\//\\\//g')
# Remove the current directory from the path
PATH=$(echo $PATH | sed -e "s/$CURRENT_DIRECTORY//g")
# Remove any double colons that can be left over
PATH=$(echo $PATH | sed -e 's/::/:/g')
# Remove any leading colons
PATH=$(echo $PATH | sed -e 's/^://g')
# Remove any trailing colons
PATH=$(echo $PATH | sed -e 's/:$//g')
echo $PATH

# Default to a random unprivileged port between 8000 and 9999
[[ -z "$DEBUG_PORT" ]] && DEBUG_PORT=$((RANDOM % 2000 + 8000))

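# Look for the Python version with a qualified name (e.g. python3.7)
# NOTE: EXPECTED_PYTHON3_LOCATION is not set anywhere in this script. If it is
# empty this lookup fails and the fallbacks below are used.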
PYTHON=$(which $($BASENAME ${EXPECTED_PYTHON3_LOCATION}))
PYTHON_MISSING=$?

if [ "$PYTHON_MISSING" -ne 0 ]; then
  # Look for a non-qualified Python 3 version
  PYTHON=$(which python3)
  PYTHON_MISSING=$?
fi

if [ "$PYTHON_MISSING" -ne 0 ]; then
  # Look for a specific path but only if this script does not have the same name
  # PYTHON is empty at this point, so build the path from this script's own name
  PYTHON="/usr/local/bin/$($BASENAME "$0")"
  CURRENT_EXECUTABLE=$0

  # If this is the same script, do not use it
  [[ "$CURRENT_EXECUTABLE" == "$PYTHON" ]] && unset PYTHON
  # Could not find Python
  [[ ! -f "$PYTHON" ]] && unset PYTHON
fi

if [ -z "$PYTHON" ]; then
  echo "Could not find Python 3.7, cannot continue"
  exit 1
fi

echo $PYTHON "$@"

if [ "$DEBUG" == "true" ]; then
  # Make sure we have unbuffered I/O (-u)
  # Pass the arguments through individually quoted so that paths with spaces survive
  cat <<EOF | "$PYTHON" -u - "$@"
import sys
import os

serversocket = None

def wait_for_connections():
    if serversocket is None:
        print ("No server socket present, debugging disabled")
        # Nothing to accept connections on, so stop here
        return

    while 1:
        (clientsocket, address) = serversocket.accept()
        clientsocket.send(str.encode("Enter a port number: "))
        data = clientsocket.recv(5)
        outbound_port_string = data.decode("utf-8")
        clientsocket.send(str.encode("Attempting to connect to debugger on 127.0.0.1, port " + outbound_port_string))
        clientsocket.close()
        pydevd_pycharm.settrace('127.0.0.1', port=int(outbound_port_string), stdoutToServer=True, stderrToServer=True)

try:
    import _thread
    import socket
    import pydevd_pycharm
    print ("pydevd_pycharm library imported, debugging enabled, connect on localhost port $DEBUG_PORT, enter a port number, and it will call back on localhost on that port")
    serversocket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    serversocket.bind(('127.0.0.1', $DEBUG_PORT))
    serversocket.listen(1)
    _thread.start_new_thread(wait_for_connections, ())

except ImportError:
    print ('pydevd_pycharm library missing, debugging disabled')

# Remove the dash argument
sys.argv.pop(0)

# Find first .py argument to get the Lambda runtime

found = False

for arg in sys.argv:
    if arg.endswith('.py'):
        # Find the directory that the Lambda runtime is in
        import_directory = os.path.dirname(arg)
        # Add the directory to the system path
        sys.path.insert(0, os.path.abspath(import_directory))
        # Find the name of the Lambda runtime and remove the suffix
        import_name = os.path.splitext(os.path.basename(arg))[0]
        # Import the Lambda runtime
        runtime = __import__(import_name)
        runtime.main()

        found = True
        break

if not found:
    print ('Could not find Lambda runtime, this should never happen')
    sys.exit(1)
EOF

else
  # Just run Python normally
  "$PYTHON" "$@"
fi
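
The port handshake in the shim can be exercised on a development machine without pydevd_pycharm installed. The following is a minimal sketch, assuming the same prompt and 5-byte read as the shim above, with the call to pydevd_pycharm.settrace swapped for a caller-supplied callback. The names serve_port_handshake, start_handshake_thread, on_port, and max_clients are hypothetical and do not appear in the original script.

```python
import socket
import threading


def serve_port_handshake(server_socket, on_port, max_clients=1):
    """Accept clients, read a port number from each, and hand it to on_port."""
    for _ in range(max_clients):
        client, _address = server_socket.accept()
        with client:
            client.sendall(b"Enter a port number: ")
            # Same 5-byte read as the shim; strip a trailing newline from telnet or netcat
            text = client.recv(5).decode("utf-8").strip()
            if not text.isdigit():
                client.sendall(b"Not a port number\n")
                continue
            message = "Attempting to connect to debugger on 127.0.0.1, port " + text + "\n"
            client.sendall(message.encode("utf-8"))
        # The client socket is closed before the callback runs, as in the shim
        on_port(int(text))


def start_handshake_thread(on_port, port=0):
    """Bind to 127.0.0.1 (port 0 picks a free port) and serve in a daemon thread."""
    server_socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server_socket.bind(("127.0.0.1", port))
    server_socket.listen(1)
    thread = threading.Thread(target=serve_port_handshake,
                              args=(server_socket, on_port), daemon=True)
    thread.start()
    return server_socket.getsockname()[1], thread
```

In the shim itself, the callback would be the existing call, pydevd_pycharm.settrace('127.0.0.1', port=..., stdoutToServer=True, stderrToServer=True). Passing a plain function such as a list's append method instead makes it possible to check the handshake on its own.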