The Substitution Rule is another technique for integrating complex functions and plays the same role in integration that the chain rule plays in differentiation. The Substitution Rule is applicable to a wide variety of integrals, but is most effective when the integrand resembles a form that the Chain Rule would produce. In this post, the Substitution Rule is explored with several examples. Python and SymPy are also used to verify our results.
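
As a small illustration of the idea, the sketch below applies the substitution \(u = x^2\) to the integral of \(2x \cos(x^2)\) and checks the result with SymPy. The integrand is an illustrative choice and is not taken from the post.

```python
from sympy import symbols, cos, integrate, diff, simplify

x, u = symbols('x u')

# Integrand of the form f(g(x)) * g'(x), with g(x) = x**2 (illustrative choice).
integrand = 2 * x * cos(x**2)

# With u = x**2 we have du = 2x dx, so the integral reduces to the integral of cos(u).
in_u = integrate(cos(u), u)

# Substitute back to express the antiderivative in terms of x.
by_substitution = in_u.subs(u, x**2)

# Differentiating the result should recover the original integrand.
residual = simplify(diff(by_substitution, x) - integrand)

# Cross-check against SymPy's own integrator.
direct = integrate(integrand, x)
```

A `residual` of zero confirms that the substituted antiderivative differentiates back to the integrand.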

## Antiderivatives

Antiderivatives, also referred to as indefinite integrals or primitive functions, are essentially the opposite of derivatives (hence the name). More formally, an antiderivative \(F\) is a function whose derivative is equal to the original function \(f\), or stated more concisely: \(F^\prime(x) = f(x)\). The Fundamental Theorem of Calculus defines the relationship between differential and integral calculus. We will see later that an antiderivative can be thought of as a restatement of an indefinite integral. Therefore, the discussion of antiderivatives provides a nice segue from differential to integral calculus. The process of finding an antiderivative of a function is known as antidifferentiation and is the reverse of differentiating a function.
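
The defining property \(F^\prime(x) = f(x)\) can be checked directly. The following is a minimal sketch with SymPy, assuming the example function \(f(x) = 3x^2 + \cos x\), which is chosen here for illustration only.

```python
from sympy import symbols, integrate, diff, cos, simplify

x, C = symbols('x C')

# Example function (an assumption for illustration).
f = 3 * x**2 + cos(x)

# SymPy returns one antiderivative and omits the constant of integration.
F = integrate(f, x)

# Every antiderivative differs from F only by a constant C.
general = F + C

# Differentiating the general antiderivative should give back f.
check = simplify(diff(general, x) - f)
```

Because the derivative of the constant \(C\) is zero, `check` is zero for any choice of \(C\).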

## Newton's Method for Finding Equation Roots

Newton's method, also known as Newton-Raphson, is an approach for finding the roots of nonlinear equations and is one of the most common root-finding algorithms due to its relative simplicity and speed. A root of a function is a point at which \(f(x) = 0\). This post explores how Newton's method works for finding roots of equations and walks through several examples with SymPy to verify our answers.
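
The iteration behind the method is \(x_{n+1} = x_n - f(x_n) / f^\prime(x_n)\). Below is a minimal sketch in plain Python. The function name `newton`, its parameters, and the test equation \(x^2 - 2 = 0\) are assumptions made for illustration and are not the post's own implementation.

```python
def newton(f, df, x0, tol=1e-10, max_iter=50):
    """Approximate a root of f starting from x0 (hypothetical helper)."""
    x = x0
    for _ in range(max_iter):
        fx = f(x)
        # Stop once the function value is close enough to zero.
        if abs(fx) < tol:
            return x
        dfx = df(x)
        if dfx == 0:
            raise ZeroDivisionError("derivative is zero; try another starting point")
        # Newton update: follow the tangent line to its x-intercept.
        x = x - fx / dfx
    raise RuntimeError("no convergence within max_iter iterations")


# Example: the positive root of x**2 - 2 is the square root of 2.
root = newton(lambda x: x**2 - 2, lambda x: 2 * x, x0=1.0)
```

The method can fail when the derivative vanishes or the starting point is poor, which is why the sketch raises an error instead of returning a value in those cases.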

## Implicit Differentiation

An implicit function defines an algebraic relationship between variables. In this post, implicit differentiation is explored with several examples including solutions using Python code.
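
As a brief illustration, the circle \(x^2 + y^2 = 25\) defines \(y\) implicitly in terms of \(x\), and differentiating both sides gives \(dy/dx = -x/y\). The sketch below checks this with SymPy's `idiff`. The circle is an example chosen here and is not necessarily one used in the post.

```python
from sympy import symbols, idiff, simplify

x, y = symbols('x y')

# Implicit relation written as expression = 0 (illustrative choice).
circle = x**2 + y**2 - 25

# idiff treats y as a function of x and solves for dy/dx.
dy_dx = idiff(circle, y, x)

# Compare with the hand-derived result dy/dx = -x/y.
difference = simplify(dy_dx + x / y)
```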

## The Chain Rule of Differentiation

The chain rule is a powerful and useful differentiation technique that allows us to differentiate functions that would not be straightforward or possible to handle with only the previously discussed rules at our disposal. The rule takes advantage of the "compositeness" of a function. In this post, we will explore several examples of the chain rule and will also confirm our results using the SymPy symbolic computation library.
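
For a composite function \(h(x) = f(g(x))\), the rule states \(h^\prime(x) = f^\prime(g(x)) \, g^\prime(x)\). The sketch below applies it to \(\sin(x^2)\), an example assumed here for illustration, and compares the result with SymPy's direct derivative.

```python
from sympy import symbols, sin, diff, simplify

x, u = symbols('x u')

# Decompose h(x) = sin(x**2) into outer f(u) = sin(u) and inner g(x) = x**2.
inner = x**2
outer = sin(u)

# Chain rule: f'(g(x)) * g'(x).
by_chain_rule = diff(outer, u).subs(u, inner) * diff(inner, x)

# SymPy differentiates the composite function directly.
direct = diff(sin(x**2), x)

difference = simplify(by_chain_rule - direct)
```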

## Limit of a Function

A function limit, roughly speaking, describes the behavior of a function around a specific value. Limits play a role in the definition of the derivative and function continuity and are also used in the study of convergent sequences. In this post, we will explore the definition of a function limit and some other limit laws using examples with Python.
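
A short sketch with SymPy's `limit` function shows both a finite limit and a case where the one-sided limits disagree. The two functions are illustrative choices and are not taken from the post.

```python
from sympy import symbols, sin, limit, oo

x = symbols('x')

# sin(x)/x is undefined at 0, yet it approaches 1 near 0.
# SymPy approaches from the right by default; the left-hand limit is the same here.
lim_sin_ratio = limit(sin(x) / x, x, 0)

# 1/x behaves differently on each side of 0, so no two-sided limit exists.
from_right = limit(1 / x, x, 0, dir='+')
from_left = limit(1 / x, x, 0, dir='-')
```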

## Derivatives of Logarithmic Functions

Implicit differentiation can also be employed to find the derivatives of logarithmic functions, which are of the form \(y = \log_a{x}\). In this post, we explore several derivatives of logarithmic functions and also prove some commonly used derivatives. The symbolic computation library SymPy is also employed to verify our answers.
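
One commonly used result is \(\frac{d}{dx} \log_a x = \frac{1}{x \ln a}\). The following sketch checks it with SymPy, assuming positive symbols `x` and `a`. In SymPy, `log(x, a)` denotes the logarithm of `x` to base `a`.

```python
from sympy import symbols, log, diff, simplify

# Positive symbols keep the logarithms real-valued (an assumption of this sketch).
x, a = symbols('x a', positive=True)

# Logarithm of x to base a; SymPy rewrites it as log(x) / log(a).
y = log(x, a)

dy_dx = diff(y, x)

# Expected closed form: 1 / (x * ln(a)).
expected = 1 / (x * log(a))

difference = simplify(dy_dx - expected)
```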

## Product, Quotient and Power Rules of Differentiation

Several rules exist for finding the derivatives of functions with several components such as \(x \sin x\). With these rules and the chain rule, which will be explored later, the derivative of any such function can be found (assuming it exists). There are five rules that help simplify the computation of derivatives, each of which will be explored in turn. We will also take advantage of SymPy to perform symbolic computations to confirm our results.
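
As an illustration, the sketch below applies the product rule \((fg)^\prime = f^\prime g + f g^\prime\) and the quotient rule \((f/g)^\prime = (f^\prime g - f g^\prime) / g^2\) to \(f(x) = x\) and \(g(x) = \sin x\), then compares each result with SymPy's direct derivative. The variable names are chosen here for illustration.

```python
from sympy import symbols, sin, diff, simplify

x = symbols('x')

# The two components of x * sin(x).
f, g = x, sin(x)

# Product rule: f' * g + f * g'.
product_rule = diff(f, x) * g + f * diff(g, x)

# Quotient rule: (f' * g - f * g') / g**2.
quotient_rule = (diff(f, x) * g - f * diff(g, x)) / g**2

# Both differences should simplify to zero.
product_check = simplify(product_rule - diff(f * g, x))
quotient_check = simplify(quotient_rule - diff(f / g, x))
```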

## Continuous Functions

In this post, we explore the definition of a continuous function and introduce several examples with Python code.

## Tukey's Test for Post-Hoc Analysis

After a multivariate test, it is often desired to know more about the specific groups to find out if they are significantly different or similar. This step after analysis is referred to as 'post-hoc analysis' and is a major step in hypothesis testing. One common and popular method of post-hoc analysis is Tukey's Test. The test is known by several different names. Tukey's test compares the means of all treatments to the mean of every other treatment and is considered the best available method in cases when confidence intervals are desired or if sample sizes are unequal.

## Kruskal-Wallis One-Way Analysis of Variance of Ranks

The Kruskal-Wallis test extends the Mann-Whitney-Wilcoxon Rank Sum test to more than two groups. Like the Mann-Whitney test, it is nonparametric and as such does not assume the data are normally distributed, so it can be used when the assumption of normality is violated. This example will employ the Kruskal-Wallis test on the `PlantGrowth` dataset as used in previous examples. Although the data appear to be approximately normally distributed as seen before, the Kruskal-Wallis test performs just as well as a parametric test.

## Quadratic Discriminant Analysis of Several Groups

Quadratic discriminant analysis for classification is a modification of linear discriminant analysis that does not assume equal covariance matrices amongst the groups (\(\Sigma_1, \Sigma_2, \cdots, \Sigma_k\)). Similar to LDA for several groups, quadratic discriminant analysis for the classification of several groups seeks to find the group that maximizes the quadratic classification function and assigns the observation vector \(y\) to that group.

## Quadratic Discriminant Analysis of Two Groups

LDA assumes the groups in question have equal covariance matrices (\(\Sigma_1 = \Sigma_2 = \cdots = \Sigma_k\)). When the groups do not have equal covariance matrices, observations are frequently assigned to groups with large variances on the diagonal of their corresponding covariance matrices (Rencher, n.d., p. 321). Quadratic discriminant analysis is a modification of LDA that does not assume equal covariance matrices amongst the groups. In quadratic discriminant analysis, the respective covariance matrix \(S_i\) of the \(i^{th}\) group is employed in predicting the group membership of an observation, rather than the pooled covariance matrix \(S_{pl}\) used in linear discriminant analysis.

## Linear Discriminant Analysis for the Classification of Several Groups

Similar to the two-group linear discriminant analysis for classification case, LDA for classification into several groups seeks to find the mean vector that the new observation \(y\) is closest to and assign \(y\) accordingly using a distance function. The several group case also assumes equal covariance matrices amongst the groups (\(\Sigma_1 = \Sigma_2 = \cdots = \Sigma_k\)).

## Linear Discriminant Analysis for the Classification of Two Groups

In this post, we will use the discriminant functions found in the first post to classify the observations. We will also employ cross-validation on the predicted groups to get a realistic sense of how the model would perform in practice on new observations. Linear classification analysis assumes the populations have equal covariance matrices (\(\Sigma_1 = \Sigma_2\)) but does not assume the data are normally distributed.

## Discriminant Analysis for Group Separation

Discriminant analysis assumes the two samples or populations being compared have the same covariance matrix \(\Sigma\) but distinct mean vectors \(\mu_1\) and \(\mu_2\) with \(p\) variables. The discriminant function that maximizes the separation of the groups is a linear combination of the \(p\) variables. The linear combination denoted \(z = a^\prime y\) transforms the observation vectors to a scalar. The discriminant functions thus take the form:

## Discriminant Analysis of Several Groups

Discriminant analysis is also applicable in the case of more than two groups. In the first post on discriminant analysis, there was only one linear discriminant function as the number of linear discriminant functions is \(s = \min(p, k - 1)\), where \(p\) is the number of dependent variables and \(k\) is the number of groups. In the case of more than two groups, there will be more than one linear discriminant function, which allows us to examine the groups' separation in more than one dimension.

## Building a Poetry Database in PostgreSQL with Python, poetpy, pandas and SQLAlchemy

In this example, we walk through a sample use case of extracting data from a database using an API and then structuring that data in a cohesive manner that allows us to create a relational database that we can then query with SQL statements. The database we will create with the extracted data will use PostgreSQL. The Python libraries that will be used in this example are poetpy, a Python wrapper for the PoetryDB API written by yours truly, pandas for transforming and cleansing the data as needed, and sqlalchemy for handling the SQL side of things.

## Introduction to Rpoet

The Rpoet package is a wrapper of the PoetryDB API, which enables developers and other users to extract a vast amount of English-language poetry from nearly 130 authors. The package provides a simple R interface for interacting with and accessing the PoetryDB database. This vignette will introduce the basic functionality of Rpoet and some example usages of the package.

## Calculating and Performing One-way Multivariate Analysis of Variance (MANOVA)

MANOVA, or Multivariate Analysis of Variance, is an extension of Analysis of Variance (ANOVA) to several dependent variables. The approach to MANOVA is similar to ANOVA in many regards and requires the same assumptions (normally distributed dependent variables with equal covariance matrices).
