Classes vs. Functions in Python

Using classes (or more broadly object-oriented programming) can really take your coding to the next level. While they can be a bit confusing at first, some of the things you can do with them would be very difficult, if not impossible, to do using function-based programming. A very good and concise definition of classes in Python can be found in the official documentation, which I’ll literally quote below:

“Classes provide a means of bundling data and functionality together. Creating a new class creates a new type of object, allowing new instances of that type to be made. Each class instance can have attributes attached to it for maintaining its state. Class instances can also have methods (defined by its class) for modifying its state.”

If you’re already using functions in Python, moving to classes is a very feasible step. The approach I will take in this post is to compare a class and a set of functions that do the same thing, i.e., some data fitting. As we go over the comparison, I will also reference bits and pieces of the quote above, hopefully making it easier to understand.

First of all, you will want to bundle data and functionality that actually belong together. With that in mind, using classes will actually make sense in the long run.

Attributes are where the data (or state) is kept inside an instance of the class. Because of the nature of classes, each instance exists in the computer memory completely independently of other instances of the same class. In other words, the attributes (or properties) of each instance can hold different values. All instances still share the same methods of the class. These are actions that can be performed by a given instance and that usually modify the attribute values of that particular instance.
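To make that a bit more concrete before getting to the data fitting code, here is a minimal sketch. The Measurement class, its attributes, and the add_value method are all made up for illustration and are not part of the fitting example that follows.

```python
class Measurement:
    """Hypothetical class used only to illustrate instances and attributes."""

    def __init__(self, name):
        # Attributes hold the state of this particular instance
        self.name = name
        self.values = []

    def add_value(self, value):
        # Methods usually modify the attributes of the instance they are called on
        self.values.append(value)


# Two instances of the same class, each with its own state
speed = Measurement('speed')
torque = Measurement('torque')
speed.add_value(10.5)
speed.add_value(11.0)
torque.add_value(3.2)

print(speed.name, speed.values)
print(torque.name, torque.values)
```

Both objects share the add_value method, but calling it on speed leaves the values stored in torque untouched.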

If you use functions on a regular basis and look inside a Python class, you’ll recognize that the methods are basically functions. That set of functions that you created to perform related tasks on your data can all be put inside a class. Besides making your code more modular and self-contained, the fact that you can have multiple instances of the same class can come in really handy, as we’ll see later on. All the code in this post can be found on my GitHub page, as well.

Let’s get started on our data fitting code. The data part is a set of ordered (x, y) pairs that will be used in a linear regression. The functionality part is the set of actions that can be performed on the data. More specifically: defining the data points, fitting a line to them, and plotting the original data points and the line of best fit.

As a general grammatical recommendation, variables (or attributes) are nouns, or groups of nouns, since they contain or define something. Variable names such as figuresize, x_values, and numdatapoints are all typical in Python.

Since functions or methods “do something” with “something”, I always start with a verb followed by one or more nouns. So, open_figure, define_data, and find_data_points are all valid names. It is a convention in Python to use underscores in function or method names. If you also use underscores in your variable names, that verb is all you’ve got to quickly distinguish between attributes (variables) and methods (functions) in your code.

Data Fitting Functions

With that said, let’s check below the three (very simple) functions that will be used. One important aspect is that most functions have input arguments and return values. In this simple example, the functions define_data and fit_data return dictionaries containing the output information. In larger projects, functions may call other functions, which in turn call more functions. Handling and tracking input and output can become pretty complex. At that point, most people resort to global variables, which, in my opinion, should be avoided at all costs. Having local variables that stay local to their functions will make debugging the code much more straightforward.

import numpy as np
import plotly.graph_objects as go
from scipy.stats import linregress


def define_data(x, y, xname=None, yname=None):
    """
    Creates dictionary containing data information.

    """
    if len(x) == len(y):
        data = {
            'xname': xname,
            'yname': yname,
            'x': x,
            'y': y,
        }
    else:
        raise Exception("'x' and 'y' must have the same length.")
    return data


def fit_data(data):
    """
    Calculates linear regression to data points.

    """
    f = linregress(data['x'], data['y'])
    fit = {
        'slope': f.slope,
        'intercept': f.intercept,
        'r2': f.rvalue**2
    }
    print('Slope = {:1.3f}'.format(fit['slope']))
    print('Intercept = {:1.3f}'.format(fit['intercept']))
    print('R-squared = {:1.3f}'.format(fit['r2']))
    return fit


def plot_data(data, fit):
    """
    Creates scatter plot of data and best fit regression line.

    """
    # Making sure x and y values are numpy arrays
    x = np.array(data['x'])
    y = np.array(data['y'])
    # Creating plotly figure
    fig = go.Figure()
    # Adding data points
    fig.add_trace(
        go.Scatter(
            name='data',
            x=x,
            y=y,
            mode='markers',
            marker=dict(size=10, color='#FF0F0E')
        )
    )
    # Adding regression line
    fig.add_trace(
        go.Scatter(
            name='fit',
            x=x,
            y=fit['slope']*x+fit['intercept'],
            mode='lines',
            line=dict(dash='dot', color='#202020')
        )
    )
    # Adding other figure objects
    fig.update_xaxes(title_text=data['xname'])
    fig.update_yaxes(title_text=data['yname'])
    fig.update_layout(
        paper_bgcolor='#F8F8F8',
        plot_bgcolor='#FFFFFF',
        width=600, height=300,
        margin=dict(l=60, r=30, t=30, b=30),
        showlegend=False)
    fig.show()

Below is a little program that uses the three functions to fit some x, y values. Notice how inputs and outputs have to be passed around between functions. Also, if we want to fit a different set of data points, we start carrying around multiple variables in memory, which can eventually be overwritten or confused with one another. By the way, I’m assuming you saved a file named fitfunctions.py in a modules folder which sits either in a folder on the Python path or in your current working folder.

from modules.fitfunctions import define_data, fit_data, plot_data

# Defining first data set
x1 = [0, 1, 2, 3, 4]
y1 = [2.1, 2.8, 4.2, 4.9, 5.1]
data1 = define_data(x1, y1, xname='x1', yname='y1')
# Fitting data
fit1 = fit_data(data1)
# Plotting results
plot_data(data1, fit1)

# Defining second data set
x2 = [0, 1, 2, 3, 4]
y2 = [3, 5.1, 6.8, 8.9, 11.2]
data2 = define_data(x2, y2, xname='x2', yname='y2')
# Fitting data
fit2 = fit_data(data2)
# Plotting results
plot_data(data2, fit2)
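The risk of confusing those variables can be sketched with a dependency-free stand-in for the functions above. This is only an illustration under a few assumptions: the least-squares fit is computed by hand instead of with linregress, and check_pairing is a hypothetical helper that is not part of fitfunctions.py.

```python
def define_data(x, y):
    # Simplified stand-in for the define_data function shown earlier
    return {'x': x, 'y': y}


def fit_data(data):
    # Least-squares slope and intercept computed by hand (no scipy needed)
    n = len(data['x'])
    xmean = sum(data['x']) / n
    ymean = sum(data['y']) / n
    sxy = sum(
        (xi - xmean) * (yi - ymean)
        for xi, yi in zip(data['x'], data['y']))
    sxx = sum((xi - xmean) ** 2 for xi in data['x'])
    slope = sxy / sxx
    return {'slope': slope, 'intercept': ymean - slope * xmean}


def check_pairing(data, fit):
    # Hypothetical helper: largest vertical distance between data and line
    return max(
        abs(yi - (fit['slope'] * xi + fit['intercept']))
        for xi, yi in zip(data['x'], data['y']))


data1 = define_data([0, 1, 2, 3, 4], [2.1, 2.8, 4.2, 4.9, 5.1])
data2 = define_data([0, 1, 2, 3, 4], [3, 5.1, 6.8, 8.9, 11.2])
fit1 = fit_data(data1)
fit2 = fit_data(data2)

# Correct pairing: the line stays close to the points
print(check_pairing(data1, fit1))
# Mismatched pairing: runs without any error, but the line belongs to other data
print(check_pairing(data1, fit2))
```

Nothing stops the second call from running, because the functions have no way of knowing which fit dictionary goes with which data dictionary. Keeping that pairing straight is entirely up to the caller.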

Data Fitting Class

Now let’s take a look at a class (saved in fitclass.py in the same modules folder) that contains the three functions as methods. As far as naming conventions in Python are concerned, classes use UpperCamelCase. Therefore, our class will be called FitData. In addition to the three functions (now as methods inside the class) there’s a constructor method, which is by convention named __init__. It’s in the constructor that we can initialize attributes, in this case the data and fit dictionaries, as well as call other methods. Notice that the first argument to all methods is self, which will hold the pointer to the class instance once that instance is created, and is how attributes and methods are accessed inside the class.

By passing only self to the fit_data method, I can access any attribute or method of the class using the self argument with dot notation. For instance, self.data['x'] gets the x values anytime I need them inside the class.

By the same token, methods don’t need to have return values (although sometimes you may want them to). They can use the self pointer to store the results as attributes of the instance. The method fit_data gets the x, y values from the data attribute and stores the fitting parameters in the fit attribute, such as self.fit['slope'] = f.slope.

import numpy as np
import plotly.graph_objects as go
from scipy.stats import linregress


class FitData:
    def __init__(self):
        """
        Class constructor.

        """
        self.data = dict()
        self.fit = dict()

    def define_data(self, x, y, xname=None, yname=None):
        """
        Creates dictionary containing data information.

        """
        if len(x) == len(y):
            self.data['x'] = x
            self.data['y'] = y
            self.data['xname'] = xname
            self.data['yname'] = yname
        else:
            raise Exception("'x' and 'y' must have the same length.")

    def fit_data(self):
        """
        Calculates linear regression to data points.

        """
        f = linregress(self.data['x'], self.data['y'])
        self.fit['slope'] = f.slope
        self.fit['intercept'] = f.intercept
        self.fit['r2'] = f.rvalue**2
        print('Slope = {:1.3f}'.format(self.fit['slope']))
        print('Intercept = {:1.3f}'.format(self.fit['intercept']))
        print('R-squared = {:1.3f}'.format(self.fit['r2']))

    def plot_data(self):
        """
        Creates scatter plot of data and best fit regression line.

        """
        # Making sure x and y values are numpy arrays
        x = np.array(self.data['x'])
        y = np.array(self.data['y'])
        # Creating plotly figure
        fig = go.Figure()
        # Adding data points
        fig.add_trace(
            go.Scatter(
                name='data',
                x=x,
                y=y,
                mode='markers',
                marker=dict(size=10, color='#FF0F0E')
            )
        )
        # Adding regression line
        fig.add_trace(
            go.Scatter(
                name='fit',
                x=x,
                y=self.fit['slope']*x+self.fit['intercept'],
                mode='lines',
                line=dict(dash='dot', color='#202020')
            )
        )
        # Adding other figure objects
        fig.update_xaxes(title_text=self.data['xname'])
        fig.update_yaxes(title_text=self.data['yname'])
        fig.update_layout(
            paper_bgcolor='#F8F8F8',
            plot_bgcolor='#FFFFFF',
            width=600, height=300,
            margin=dict(l=60, r=30, t=30, b=30),
            showlegend=False
        )
        fig.show()

The program below can be contrasted with the one that employed the functions. We can start by creating two instances of the FitData class, which will be two distinct and independent objects in the computer memory. Each instance now holds the self pointer and can therefore access attributes and call methods using dot notation. fitdata1.plot_data() will run that method with whatever attributes are required and that are already stored in that particular instance. Also, print(fitdata2.data) would display the values assigned to the data attribute in the fitdata2 object.

from modules.fitclass import FitData

# Creating class instances
fitdata1 = FitData()
fitdata2 = FitData()

# Defining first data set
x1 = [0, 1, 2, 3, 4]
y1 = [2.1, 2.8, 4.2, 4.9, 5.1]
fitdata1.define_data(x1, y1, xname='x1', yname='y1')
# Fitting data
fitdata1.fit_data()
# Plotting results
fitdata1.plot_data()

# Defining second data set
x2 = [0, 1, 2, 3, 4]
y2 = [3, 5.1, 6.8, 8.9, 11.2]
fitdata2.define_data(x2, y2, xname='x2', yname='y2')
# Fitting data
fitdata2.fit_data()
# Plotting results
fitdata2.plot_data()

You should also observe that all the variables that were being passed around are now encapsulated inside each object, making for code that is more organized and less prone to errors. Of course, these examples are quite simple and keeping track of what’s going on is still pretty straightforward.

Finally, similar to functions, methods can call methods. But since input arguments and return values can all be stored inside the class attributes, nesting method calls becomes much cleaner and easier to do than nesting function calls.
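As a rough sketch of a method calling other methods, consider the stripped-down class below. FitLine and its run method are hypothetical and not part of fitclass.py, and the regression is computed by hand so the snippet runs without scipy or plotly.

```python
class FitLine:
    """Hypothetical, stripped-down cousin of FitData (no plotting, no scipy)."""

    def __init__(self):
        self.data = dict()
        self.fit = dict()

    def define_data(self, x, y):
        if len(x) != len(y):
            raise ValueError("'x' and 'y' must have the same length.")
        self.data['x'] = x
        self.data['y'] = y

    def fit_data(self):
        # Least-squares line computed by hand from the stored data
        x, y = self.data['x'], self.data['y']
        xmean = sum(x) / len(x)
        ymean = sum(y) / len(y)
        sxy = sum((xi - xmean) * (yi - ymean) for xi, yi in zip(x, y))
        sxx = sum((xi - xmean) ** 2 for xi in x)
        self.fit['slope'] = sxy / sxx
        self.fit['intercept'] = ymean - self.fit['slope'] * xmean

    def run(self, x, y):
        # A method calling other methods: nothing is passed between them,
        # because every step reads and writes the instance attributes
        self.define_data(x, y)
        self.fit_data()


fitline = FitLine()
fitline.run([0, 1, 2, 3, 4], [3, 5.1, 6.8, 8.9, 11.2])
print(fitline.fit)
```

The run method never handles a return value. Each step leaves its result in self.data or self.fit, and the next step picks it up from there.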

Random Remarks

I was first exposed to Object-Oriented Programming a little over 6 years ago, when I was playing around with a LEGO EV3 Mindstorms. It was in MATLAB, which had been my programming language for many, many years. It took me some time to wrap my head around the concept, but once I understood its power, a whole new dimension of programming opened up! I then went on to write the code for an entire engine test cell automation system: GUIs, instrumentation interfaces, and everything in between. The level of complexity and modularization required to succeed was only achievable through the use of classes.

Speaking of GUIs, those are built upon the concept of classes. If you ever venture down that path, that’s another strong reason to learn more about classes and start using them. Once you get it, you’ll never look back.

A useful Python plotting module

First of all, let’s talk a little bit about online Python documentation. At the top of my list is the official Python Documentation. It is extremely well written, with plenty of examples. I always start from the Tutorial page and dig from there. For the specific “how-to-do-that” questions, do a Google search and then ONLY choose answers from Stack Overflow or Stack Exchange. Any other sites will probably flash more ads in front of your eyes than you can handle. Not to mention, their answers are likely copied from the two sites I mentioned.

Some of my examples throughout this blog use a plotting function that I made using Plotly. As far as graphing packages in Python are concerned, Plotly and Matplotlib are my favorites. However, I chose the former for this because its graphs are more interactive, which can be helpful when looking at data. In my opinion, you will get the most out of the function if you’re using VSCode and have the Jupyter notebook extensions installed. The Plotly graphs will show right there in the interactive window.

The plotting function should be inside a Python module, which I named utils.py. It’s a starting place for someone to add other functions or classes for tasks that are done routinely in programs. You can copy the code below and save it in a utils.py module and import it as needed when running some of the examples. Check out how to use it here.

""" utils.py

Contains a useful plotting function that is used in the coding examples.
The function was built using Plotly instead of Matplotlib due to its
interactive graphs and because it runs better on Raspberry Pi Linux.

Author: Eduardo Nigro
    rev 0.0.6
    2022-01-24
"""
import numpy as np
import plotly.io as pio
import plotly.graph_objects as go
from plotly.subplots import make_subplots

# Setting plotting modified template as default
mytemplate = pio.templates["plotly_white"]
mytemplate.layout["paper_bgcolor"] = "rgb(250, 250, 250)"
pio.templates.default = mytemplate


def plot_line(
    x, y, xname="Time (s)", yname=None, axes="single",
    figsize=None, line=True, marker=False, legend=None):
    """
    Plot lines using plotly.

    :param x: x values for plotting.
        List or ndarray. List of lists or ndarrays is also supported.
    :type x: list(float), list(list), list(ndarray)

    :param y: y values for plotting.
        List or ndarray. List of lists or ndarrays is also supported.
    :type y: list(float), list(list), list(ndarray)

    :param xname: The x axis title. Default value is ``'Time (s)'``.
        If ``'Angle (deg.)'`` is used, axes ticks are configured in 360 degree
        increments.
    :type xname: str

    :param yname: The y axis title.
        A string or list of strings containing the names of the y axis titles.
        If ``None``, the y axis titles will be ``'y0'``, ``'y1'``, etc.
    :type yname: str, list(str)

    :param axes: The configuration of axis on the plot.
        If ``'single'``, multiple curves are plotted on the same axis.
        If ``'multi'``, each curve is plotted on its own axis.
    :type axes: str

    :param figsize: The figure size (``width``, ``height``) in pixels.
    :type figsize: tuple(int)

    :param line: Displays the curves if ``True``.
    :type line: bool, list(bool)

    :param marker: Displays markers on the curves if ``True``.
    :type marker: bool, list(bool)

    :param legend: List of legend names for multiple curves.
        Length of `legend` must be the same as length of `y`.
    :type legend: list(str)

    Example:
        >>> import numpy as np
        >>> from utils import plot_line
        >>> t = np.linspace(0,2,100)
        >>> y0 = np.sin(1*np.pi*t)
        >>> y1 = np.cos(1*np.pi*t)
        >>> plot_line(
                [t]*2, [y0, y1],
                yname=['sin & cos'],
                legend=['sin(pi x t)', 'cos(pi x t)']
                )

    """
    # Making sure x and y inputs are put in lists if needed
    if type(x) != list:
        x = [x]
    else:
        if type(x[0]) not in [list, np.ndarray]:
            x = [x]
    if type(y) != list:
        y = [y]
    else:
        if type(y[0]) not in [list, np.ndarray]:
            y = [y]
    # Doing a simple check for consistent x and y inputs
    if len(x) != len(y):
        raise Exception("'x' and 'y' inputs must have the same length.")
    # Adjusting y axis title based on input
    if not yname:
        yname = ["y" + str(i) for i in range(len(y))]
    elif type(yname) != list:
        yname = [yname]
    if (len(yname) == 1) and (len(y) > 1):
        yname = yname * len(y)
    # Setting legend display option
    if legend is not None:
        if len(legend) == len(y):
            showlegend = True
        else:
            raise Exception("'y' and 'legend' must have the same length.")
    else:
        showlegend = False
        legend = [None] * len(y)
    # Checking for single (with multiple curves)
    # or multiple axes with one curve per axes
    if axes == "single":
        naxes = 1
        iaxes = [0] * len(y)
        colors = [
            "#1F77B4",
            "#FF7F0E",
            "#2CA02C",
            "#D62728",
            "#9467BD",
            "#8C564B",
            "#E377C2",
            "#7F7F7F",
            "#BCBD22",
            "#17BECF",
        ]
    elif axes == "multi":
        naxes = len(y)
        iaxes = range(0, len(y))
        colors = ["rgb(50, 100, 150)"] * len(y)
    else:
        raise Exception("Valid axes options are: 'single' or 'multi'.")
    # Checking for line and marker options
    if type(line) != list:
        line = [line] * len(y)
    if type(marker) != list:
        marker = [marker] * len(y)
    mode = []
    markersize = []
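    for linei, markeri in zip(line, marker):
        if linei and markeri:
            mode.append("lines+markers")
            markersize.append(2)
        elif linei and not markeri:
            mode.append("lines")
            markersize.append(2)
        elif not linei and markeri:
            mode.append("markers")
            markersize.append(8)
        else:
            # Without this branch the curve would be silently dropped
            raise Exception("Each curve needs 'line' or 'marker' set to True.")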
    # Setting figure parameters
    if figsize:
        wfig, hfig = figsize
    else:
        wfig = 650
        hfig = 100 + 150*naxes
    m0 = 10
    margin = dict(l=6 * m0, r=3 * m0, t=3 * m0, b=3 * m0)
    # Plotting results
    fig = make_subplots(rows=naxes, cols=1)
    for i, xi, yi, ynamei, legendi, colori, modei, markersizei in zip(
            iaxes, x, y, yname, legend, colors, mode, markersize):
        # Adding x, y traces to appropriate plot
        fig.add_trace(
            go.Scatter(
                x=xi,
                y=yi,
                name=legendi,
                mode=modei,
                line=dict(width=1, color=colori),
                marker=dict(size=markersizei, color=colori),
            ),
            row=i + 1,
            col=1,
        )
        # Adding x axes ticks
        if xname.lower().find("angle") < 0:
            # Regular x axes
            fig.update_xaxes(matches="x", row=i + 1, col=1)
        else:
            # Special case where x axes has angular values
            fig.update_xaxes(
                tickmode="array",
                tickvals=np.arange(0, np.round(xi[-1]) + 360, 360),
                matches="x",
                row=i + 1,
                col=1,
            )
        # Adding y axis title to all plots
        fig.update_yaxes(title_text=ynamei, row=i + 1, col=1)
    # Adding x axis title to bottom plot only
    fig.update_xaxes(title_text=xname, row=naxes, col=1)
    # Applying figure size, margins, and legend
    fig.update_layout(
        margin=margin, width=wfig, height=hfig, showlegend=showlegend)
    fig.show()
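The first few lines of plot_line normalize the inputs so that a single curve and a list of curves go through the same plotting loop. The sketch below isolates that idea without any dependencies. The helper names as_curve_list and expand_names are hypothetical, and the np.ndarray check of the original is left out to keep the snippet free of numpy.

```python
def as_curve_list(values):
    """Return a list of curves, wrapping a single curve if needed."""
    if not isinstance(values, list):
        # Anything that is not a list is treated as one curve
        return [values]
    if len(values) > 0 and not isinstance(values[0], list):
        # A flat list of numbers is one curve
        return [values]
    # Already a list of curves
    return values


def expand_names(names, ncurves):
    """Return the y axis titles as a list, one per curve when possible."""
    if not names:
        # Default titles: 'y0', 'y1', etc.
        return ['y' + str(i) for i in range(ncurves)]
    if not isinstance(names, list):
        names = [names]
    if len(names) == 1 and ncurves > 1:
        # A single title is repeated for every curve
        names = names * ncurves
    return names


y = as_curve_list([1.0, 2.0, 3.0])
print(y)
print(expand_names(None, 2))
print(expand_names('Speed (rpm)', 2))
```

With the inputs in that shape, the rest of the function can simply zip over the curves, titles, legends, and colors without special cases for the single-curve call.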