EWMA filter example using pandas and python

This article gives an example of how to use an exponentially weighted moving average (EWMA) filter to remove noise from a data set using the pandas library in Python 3. I am writing this because the syntax for the library function has changed. The syntax I had been using is shown in Connor Johnson’s well-explained example here.
I will give some example code, plot the data sets, then explain the code. The pandas documentation for this function is here. Like a lot of pandas documentation it is thorough, but could do with some more worked examples. I hope this article will plug some of that gap.
Here’s the example code:

import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
ewma = pd.Series.ewm

x = np.linspace(0, 2 * np.pi, 100)
y = 2 * np.sin(x) + 0.1 * np.random.normal(size=x.size) # zero-mean Gaussian noise
df = pd.Series(y)
# take EWMA in both directions then average them
fwd = ewma(df,span=10).mean() # take EWMA in fwd direction
bwd = ewma(df[::-1],span=10).mean() # take EWMA in bwd direction
filtered = np.vstack(( fwd, bwd[::-1] )) # lump fwd and bwd together
filtered = np.mean(filtered, axis=0 ) # average
plt.title('filtered and raw data')
plt.plot(y, color = 'orange')
plt.plot(filtered, color='green')
plt.plot(fwd, color='red')
plt.plot(bwd, color='blue')

This produces the following plot. Orange line = noisy data set. Blue line = backwards filtered EWMA data set. Red line = forwards filtered EWMA data set. Green line = sum and average of the two EWMA data sets. This is the final filtered output.

EWMA filtered and raw data.

Let’s look at the example code. After importing the libraries I need in lines 1-5, I create some example data. Line 6 creates 100 x-values spaced evenly from 0 to 2 * pi. Line 7 creates 100 y-values from these 100 x-values. Each y-value = 2*sin(x) + some noise. The noise is generated using the np.random.normal function. This noisy sine function is plotted in line 15 and can be seen as the jagged orange line on the plot.
Forwards and backwards EWMA filtered data sets are created in lines 10 and 11.
Line 10 starts with the first x-sample and the corresponding y-sample and works forwards and creates an EWMA filtered data set called fwd. This is plotted in line 17 as the red line.
Line 11 starts at the opposite end of the data set and works backwards to the first – this is the backwards EWMA filtered set, called bwd. This is plotted in line 18 as the blue line.
These two EWMA filtered data sets are added and averaged in lines 12-13. This data set is called filtered. This data set is plotted in line 16 as the green line.
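To make concrete what each pass computes, here is a minimal sketch of the EWMA recursion written out by hand and checked against pandas. It assumes the pandas relation alpha = 2 / (span + 1) and uses adjust=False so that pandas applies the plain recursive form. The example code above uses the default adjust=True, which weights the first few samples slightly differently. The helper name ewma_manual is made up for this illustration and is not part of pandas.

```python
import numpy as np
import pandas as pd


def ewma_manual(values, span):
    ''' hand-written EWMA: each output blends the new sample with the previous output '''
    alpha = 2.0 / (span + 1)  # pandas definition of alpha in terms of span
    out = np.empty(len(values))
    out[0] = values[0]  # the first output is just the first sample
    for i in range(1, len(values)):
        out[i] = alpha * values[i] + (1 - alpha) * out[i - 1]
    return out


rng = np.random.default_rng(0)  # fixed seed so the run is repeatable
x = np.linspace(0, 2 * np.pi, 100)
y = 2 * np.sin(x) + 0.1 * rng.normal(size=x.size)

manual = ewma_manual(y, span=10)
pandas_result = pd.Series(y).ewm(span=10, adjust=False).mean().to_numpy()
print(np.allclose(manual, pandas_result))
```

The backwards pass is the same recursion run over the reversed data, which is why its lag points the other way.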
If you look at the ewma calls in lines 10 and 11, there is a parameter called span. This controls the width of the filter. The backwards EWMA data lags behind the final averaged output by an amount set by this value. Similarly, the forward EWMA data set is offset forwards of the noisy data set by an amount set by this value. Increasing the span increases the smoothing and the lag. It also reduces the peaks of the filtered data relative to the unfiltered data. You need to try out different values to find one that suits your data.
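As a rough way to see the effect of span on the peaks, the sketch below wraps the forwards and backwards passes in a helper and runs it on a noise-free sine wave for a few span values. The helper name fwd_bwd_ewma and the chosen spans (5, 10, 30) are assumptions made for this illustration. The filtering steps are the same as in the example code.

```python
import numpy as np
import pandas as pd


def fwd_bwd_ewma(values, span):
    ''' average a forwards and a backwards EWMA pass over values '''
    series = pd.Series(values)
    fwd = series.ewm(span=span).mean()
    # reverse, filter, then reverse again so the index lines up with fwd
    bwd = series[::-1].ewm(span=span).mean()[::-1]
    return ((fwd + bwd) / 2).to_numpy()


x = np.linspace(0, 2 * np.pi, 100)
clean = 2 * np.sin(x)  # no noise, so only the effect of span is visible

peaks = {span: fwd_bwd_ewma(clean, span).max() for span in (5, 10, 30)}
for span, peak in peaks.items():
    print('span={:2d} filtered peak={:.3f} unfiltered peak={:.3f}'.format(
        span, peak, clean.max()))
```

The filtered peak shrinks as the span grows, so the span is a trade-off between removing noise and preserving the shape of the signal.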
My present application for this filter is removing jitter from accelerometer data. I have also used this filter to smooth signals from hydrophones.

Using pyzmq to communicate between GUIs and processes

Graphical user interfaces (GUIs) all want to be the main thread. They don’t play well together. Trying to run GUIs built with different libraries concurrently and get them to talk to one another took me a while to figure out. This article shows how I used the pyzmq library to communicate between two graphical user interfaces (GUIs). 

I am working on unique hand gesture recognition. One GUI represents a hand position. It is built with pyqt and has a few range sliders. The sliders will be used to represent pitch, roll and speed of motion in the final application. A second GUI represents the gesture recognition interface. For this example it is a simple label box set up in pyqtgraph. I used pyqtgraph as this is the toolkit I am using in my final application for real-time data display from an accelerometer mounted on a hand. I based my pyzmq script on the examples here.
I played with the publisher-subscriber (pubsub) examples. One of the nice things about the pubsub model is that if you send something from the publisher, even if there are no subscribers waiting for the message, nothing blocks or stalls your script. Pubsub is only one-way communication, from the publisher to the subscriber. I opted instead to use the pair model. In this pattern, a socket is set up that allows an object at each end to send messages backwards and forwards.
Pyzmq comes with a partial implementation of the Tornado server. This is explained here. So you can set up an event loop to trigger on poll events using ioloop. If you are already using a GUI, then the odds are that you have an event handler running in that GUI. Getting these event handling loops to play nicely with the Tornado server led me down a coding rabbit hole. So I opted to use the event handling loop set up by timer = QtCore.QTimer() in pyqtgraph to poll one end of the pyzmq pair socket that I set up. This is not elegant, but I can’t see a more reliable method. I am already using this QTimer to animate the sensor data that displays hand position, so it is already running. Whichever method I use to receive data from the hand posture GUI, at some point I have to decide to look at the data and use it. I thought about using the pyzmq.Queue structure, which is process safe. I could use this to automatically update a list in my sensor display GUI with new posture positions. This list won’t be looked at until the QTimer triggers, so I may as well simplify things and look for the updated posture position in the QTimer handling method.
Here’s the code I use to generate the rangeslider GUI. This can be downloaded from: GitHub. Most of this is boilerplate to produce the GUI. Lines 102-107 create the pyzmq pair socket. Note the try/except wrapper in lines 97-99 around socket.send_string. The send raises a zmq.error.Again exception if there is nothing to receive the message. Using the try/except wrapper allows the code to continue. The ‘flags=zmq.NOBLOCK’ stops the code from blocking if there is nothing at the other end of the socket to receive the message. This isn’t an issue with the pubsub model, where a publisher doesn’t care if there is no subscriber around to receive the message, but the pair pattern will stall without a receiver unless you explicitly tell it not to block.
'''
Created on 10 Oct 2016

@author: matthew oppenheim
use pyzmq pair context for communication
'''

from multiprocessing import Process
from PyQt4 import QtGui, QtCore
from qrangeslider import QRangeSlider
import sys
import zmq
from zmq.eventloop import ioloop, zmqstream
from pubsub_zmq import PubZmq, SubZmq

class Example(QtGui.QWidget):
    def __init__(self):
        app = QtGui.QApplication(sys.argv)
        self.port = 5556
        self.topic = "1"

    def initUI(self):
        self.range_duration = QRangeSlider()   
        self.textbox = QtGui.QLineEdit()
        self.set_duration_btn = QtGui.QPushButton("send duration")
        self.range_pitch = QRangeSlider()    
        self.range_pitch.setRange(-20, 20)
        self.set_pitch_btn = QtGui.QPushButton("send pitch")
        self.range_roll = QRangeSlider()    
        self.range_roll.setRange(-20, 20)
        self.set_roll_btn = QtGui.QPushButton("send roll")
        hbox_duration = QtGui.QHBoxLayout()
        hbox_pitch = QtGui.QHBoxLayout()
        hbox_roll = QtGui.QHBoxLayout()

        vbox = QtGui.QVBoxLayout()
        self.setGeometry(300, 300, 300, 150)
        self.socket = self.create_socket(self.port)

    def button_click(self, message):
        ''' handle button click event '''
        self.textbox.setText('sent {}'.format(message))
        try:
            # NOBLOCK raises zmq.error.Again instead of stalling the GUI
            self.socket.send_string(message, flags=zmq.NOBLOCK)
        except zmq.error.Again as e:
            print('no receiver for the message: {}'.format(e))

    def create_socket(self, port):
        ''' create a socket using pyzmq with PAIR context '''
        context = zmq.Context()
        socket = context.socket(zmq.PAIR)
        socket.bind("tcp://*:%s" % port)
        return socket


if __name__ == '__main__':
    ex = Example()

Here’s the simple label box that I use to test out receiving messages:

'''
pyqtgraph layout with a pyzmq pair context
for testing pubsub messaging with pyzmq
Created on 14 Oct 2016
using qt timer and polling instead of the tornado loop in zmq
@author: matthew oppenheim
'''

import pyqtgraph as pg
from pyqtgraph.Qt import QtGui, QtCore
from pubsub_zmq import SubZmq
from multiprocessing import Process
import zmq
import sys
import time


class PyqtgraphPair(QtGui.QWidget):
    def __init__(self):
        port = '5556'
        topic = '1'
        self.layout = QtGui.QVBoxLayout()
        self.label = QtGui.QLabel("test")
        self.set_label("new label")
        self.socket = self.create_socket(port)

    def create_socket(self, port):
        context = zmq.Context()
        socket = context.socket(zmq.PAIR)
        socket.connect('tcp://localhost:%s' % port) 
        return socket

    def set_label(self, text):
        ''' set the label to text '''
        self.label.setText(text)

    def timer_timeout(self):
        ''' handle the QTimer timeout '''
        try:
            msg = self.socket.recv(flags=zmq.NOBLOCK).decode()
            print('message received {}'.format(msg))
        except zmq.error.Again as e:
            # nothing waiting on the socket this time round
            pass

if __name__ == '__main__':
    win = PyqtgraphPair()
    timer = QtCore.QTimer()
    if (sys.flags.interactive != 1) or not hasattr(QtCore,

Polling for a new message takes place in line 61. This has the same try/except wrapper as in the rangeslider example.
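The non-blocking behaviour at both ends of the pair socket can be tried out without any GUI. The following is a minimal sketch, not the code from my application. It assumes the inproc transport so that both ends can live in one script (the GUIs above use tcp), and the helper names try_send and try_recv and the endpoint name pair_demo are made up for the illustration. The poll calls give the socket a chance to notice the new peer before the second send.

```python
import zmq


def try_send(socket, message):
    ''' send without blocking; return False if no peer can take the message '''
    try:
        socket.send_string(message, flags=zmq.NOBLOCK)
        return True
    except zmq.error.Again:
        return False


def try_recv(socket):
    ''' receive without blocking; return None if nothing is waiting '''
    try:
        return socket.recv_string(flags=zmq.NOBLOCK)
    except zmq.error.Again:
        return None


context = zmq.Context()
sender = context.socket(zmq.PAIR)
sender.bind('inproc://pair_demo')  # hypothetical endpoint name

# nothing is connected yet, so the send is refused instead of blocking
sent_without_peer = try_send(sender, 'pitch 10')

receiver = context.socket(zmq.PAIR)
receiver.connect('inproc://pair_demo')

sender.poll(1000, zmq.POLLOUT)  # wait up to 1 s until the peer is attached
sent_with_peer = try_send(sender, 'pitch 10')

receiver.poll(1000, zmq.POLLIN)  # wait up to 1 s for the message to arrive
received = try_recv(receiver)
nothing_more = try_recv(receiver)  # the queue is now empty

print(sent_without_peer, sent_with_peer, received, nothing_more)

sender.close(linger=0)
receiver.close(linger=0)
context.term()
```

In the GUI version, the call to try_recv would sit inside the QTimer timeout handler, so an empty socket costs one failed poll per timer tick.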

python – how to communicate between threads using pydispatcher

The pydispatcher module makes it straightforward to communicate between different threads in the same process in Python.

Why would I want to do this?

I am collecting and processing sensor data from an accelerometer and want to display this in real time. The interface has some controls to save the data and to change the sampling rate of the sensor. Naturally, I want to interact with the user interface without having to wait for the sensor data to be collected and processed. I also want the sensor to be continuously sampled, without having to wait for the real-time display to update.

I run the graphical user interface (GUI) in one thread and use a separate thread to handle getting data from the sensor. This way the sensor is continuously sampled and the display remains responsive.

I use pydispatcher to send sensor measurements from the sensor thread to the display thread. I also use pydispatcher to communicate from the display thread back to the sensor thread to control the rate at which the sensor collects data or to stop data collection. So I have two-way communication between the threads. I pass numpy arrays from the sensor thread to the display thread and send text from the display thread to the sensor thread. The text is then interpreted by the sensor thread to alter the sensor sampling rate, or stop sampling. Pydispatcher does not seem to mind what kind of data is sent as messages.

The application that I have described takes up quite a lot of code and is split over several classes. So I will present the code for a simpler example, which shows how to set up and apply pydispatcher and introduces some of the features that make the library versatile.

Here is an example Python 3 script that creates two threads and has them communicate. When the script is executed, __name__ is set to __main__, so lines 46-50 will be the first to execute. A thread that instantiates the Alice class is defined and started in lines 47-48, and a separate thread that instantiates the Bob class is defined then started in lines 49-50.

In line 26 the alice_thread thread prints out a message ‘Alice is procrastinating’ every second.

In line 43 the bob_thread sends a message to the alice_thread every three seconds using a dispatcher. The alice_thread reacts to this dispatcher message by returning a message of her own to the bob_thread using a separate dispatcher.

If we look at line 15 in the Alice class, a dispatcher listener is set up:

dispatcher.connect(self.alice_dispatcher_receive, signal=BOB_SIGNAL, sender=BOB_SENDER)

This means that when a dispatcher.send statement with the signal BOB_SIGNAL and sender BOB_SENDER is executed anywhere else in the process, the method alice_dispatcher_receive will be triggered, so long as an instance of the Alice class has been created. In line 43, the Bob class sets up a dispatcher sender, which is designed to trigger the dispatcher listener in the Alice class described above.

dispatcher.send(message='message from Bob', signal=BOB_SIGNAL, sender=BOB_SENDER)

Having signal and sender names for each dispatcher listener and sender is a little confusing at first. Why do we have to define two identifiers for the dispatcher? Being able to define two identifiers allows us to group dispatchers from the same sender, using the sender identifier. Then we can have the same sender class sending different types of signal, for example data from different sensors, each one with the same sender identifier but a different signal identifier. This is verbose, but this verbosity makes for unambiguous, easy-to-maintain code.

Lines 6-9 define the names of the signals and senders for Alice and Bob.

When the alice_thread receives a dispatch from the bob_thread thread, she replies with a dispatch sender of her own (line 21). The corresponding dispatch listener is defined in the Bob class in line 33.

''' demonstrate the pydispatch module '''
from pydispatch import dispatcher
import threading
import time

ALICE_SIGNAL = 'alice_signal'  # the identifier values are arbitrary strings
ALICE_SENDER = 'alice_sender'
BOB_SIGNAL = 'bob_signal'
BOB_SENDER = 'bob_sender'

class Alice():
    ''' alice procrastinates and replies to bob'''
    def __init__(self):
        print('alice instantiated')
        dispatcher.connect(self.alice_dispatcher_receive, signal=BOB_SIGNAL, sender=BOB_SENDER)
        self.alice()

    def alice_dispatcher_receive(self, message):
        ''' handle dispatcher'''
        print('alice has received message: {}'.format(message))
        dispatcher.send(message='thankyou from Alice', signal=ALICE_SIGNAL, sender=ALICE_SENDER)

    def alice(self):
        ''' loop and wait '''
        while(1):
            print('Alice is procrastinating')
            time.sleep(1)

class Bob():
    ''' bob contacts alice periodically '''
    def __init__(self):
        print('Bob instantiated')
        dispatcher.connect(self.bob_dispatcher_receive, signal=ALICE_SIGNAL, sender=ALICE_SENDER)
        self.bob()

    def bob_dispatcher_receive(self, message):
        ''' handle dispatcher '''
        print('bob has received message: {}'.format(message))

    def bob(self):
        ''' loop and send messages using a dispatcher '''
        while(1):
            dispatcher.send(message='message from Bob', signal=BOB_SIGNAL, sender=BOB_SENDER)
            time.sleep(3)

if __name__ == '__main__':
    alice_thread = threading.Thread(target=Alice)
    alice_thread.start()
    bob_thread = threading.Thread(target=Bob)
    bob_thread.start()

Running the script prints the following:

alice instantiated
Alice is procrastinating
Bob instantiated
alice has received message: message from Bob
bob has received message: thankyou from Alice
Alice is procrastinating
Alice is procrastinating
Alice is procrastinating
alice has received message: message from Bob
bob has received message: thankyou from Alice
Alice is procrastinating
Alice is procrastinating
alice has received message: message from Bob
bob has received message: thankyou from Alice
Alice is procrastinating
Alice is procrastinating
Alice is procrastinating
alice has received message: message from Bob
bob has received message: thankyou from Alice

To conclude, there are different ways to communicate between threads in Python. I choose pydispatcher as the library allows me to write code that I can understand when I come back to it six months later, and I don’t have to worry about the type of message that I am passing between the threads.