Diprotic Acids
The mathematical description of diprotic acids (with carbonic acid as a case of particular interest) leads to simple analytical equations. The derivation gives a better understanding of what happens “inside” hydrochemical models and software.^{1}
Basic Set of Mathematical Equations
When a diprotic acid H_{2}A (the solute) is added to pure water (the solvent), the equilibrium state of the solution is characterized by five dissolved species: H^{+}, OH^{-}, H_{2}A, HA^{-}, and A^{2-}.
Thus, five equations are required for its math description:^{2}
(1a)  K_{1}  = {H^{+}} {HA^{-}} / {H_{2}A}  (1^{st} diss. step)
(1b)  K_{2}  = {H^{+}} {A^{2-}} / {HA^{-}}  (2^{nd} diss. step)
(1c)  K_{w}  = {H^{+}} {OH^{-}}  (self-ionization of water)
(1d)  C_{T}  = [H_{2}A] + [HA^{-}] + [A^{2-}]  (mass balance)
(1e)  0  = [H^{+}] – [HA^{-}] – 2 [A^{2-}] – [OH^{-}]  (charge balance)
The first three equations are mass-action laws; the last two equations represent the mass balance and the charge balance. While the mass-action laws are based on activities (here denoted by curly braces), the mass-balance and charge-balance equations rely on molar concentrations (denoted by square brackets).
An exact solution in closed form (i.e. an analytical formula) is only obtainable if the activities in the first three equations are replaced by molar concentrations.^{3} This is valid either in (very) dilute systems or by switching to conditional equilibrium constants ^{c}K. In the following we assume that this has been done (without explicitly introducing the notation ^{c}K).
Ionization Fractions
Let’s start with the acid-species distribution as a function of pH. To study this behavior, a subset of the above equation system is sufficient, consisting of three equations only: (1a), (1b), and (1d). From the first two equations one gets (with the abbreviation x = {H^{+}} = 10^{-pH}):
(2)  [H_{2}A] = (x/K_{1}) [HA^{-}] and [A^{2-}] = (K_{2}/x) [HA^{-}]
Substituting both expressions into Eq. (1d) yields
(3)  C_{T} = (x/K_{1} + 1 + K_{2}/x) [HA^{-}]
This allows us to write the following simple formulas for the three dissolved species:
(4)  [H_{2}A] = C_{T} a_{0}  [HA^{-}] = C_{T} a_{1}  [A^{2-}] = C_{T} a_{2}
with the three ionization fractions:
(5a)  a_{0} = [ 1 + K_{1}/x + K_{1}K_{2}/x^{2} ]^{-1}
(5b)  a_{1} = [ x/K_{1} + 1 + K_{2}/x ]^{-1}  =  (K_{1}/x) a_{0}
(5c)  a_{2} = [ x^{2}/(K_{1}K_{2}) + x/K_{2} + 1 ]^{-1}  =  (K_{1}K_{2}/x^{2}) a_{0}
It’s easy to check that all three coefficients add up to 1:
(6)  a_{0} + a_{1} + a_{2} = 1  (mass balance) 
Because of their elegance and simplicity, diagrams of ionization fractions (also known as Bjerrum plots) appear in almost every textbook on hydrochemistry. Below is an example for the carbonic acid system (with pK_{1} = 6.35, pK_{2} = 10.33):
The three small circles in the diagram represent equivalence points.
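As a minimal sketch, the ionization fractions of Eqs. (5a) to (5c) can be evaluated numerically as follows. The function name `ionization_fractions` is hypothetical; the default pK values are the carbonic-acid values quoted above, and activities are assumed equal to concentrations.

```python
def ionization_fractions(pH, pK1=6.35, pK2=10.33):
    """Return (a0, a1, a2) from Eqs. (5a)-(5c) for a diprotic acid."""
    x = 10.0 ** (-pH)      # x = {H+} = 10^(-pH)
    K1 = 10.0 ** (-pK1)
    K2 = 10.0 ** (-pK2)
    a0 = 1.0 / (1.0 + K1 / x + K1 * K2 / x**2)   # Eq. (5a)
    a1 = (K1 / x) * a0                            # Eq. (5b)
    a2 = (K1 * K2 / x**2) * a0                    # Eq. (5c)
    return a0, a1, a2


# Tabulate the fractions at a few pH values, including pH = pK1 and pH = pK2.
for pH in (4.0, 6.35, 8.34, 10.33, 12.0):
    a0, a1, a2 = ionization_fractions(pH)
    print(f"pH {pH:5.2f}: a0={a0:.4f}  a1={a1:.4f}  a2={a2:.4f}  sum={a0 + a1 + a2:.4f}")
```

At pH = pK_{1} the fractions a_{0} and a_{1} coincide, and at pH = pK_{2} the fractions a_{1} and a_{2} coincide; these are the crossing points of the curves in a Bjerrum plot.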
The concentrations of the three acid species in Eq. (4) can also be combined in one formula:
(7)  [H_{2-j}A^{-j}] = C_{T} a_{j}(x)  for j = 0, 1, 2
This formula, together with Eq. (5), predicts the pH dependence of the three acid species. Aside from the normalization constant C_{T}, the concentration curves correspond to the ionization curves in the above diagram.
Be careful though: C_{T} is not a constant, as Eq. (7) would suggest. C_{T} depends on pH, as shown below, and therefore cannot be regarded as an independent parameter. This misunderstanding arises because we have so far ignored both the charge balance and the self-ionization of water, i.e. Eqs. (1e) and (1c).
Exact Analytical Solution
The problem mentioned above is solved by incorporating the two remaining constraints: charge balance and the self-ionization of water. Inserting Eq. (1c) into Eq. (1e) and using Eq. (7), together with the shorthand y_{j} = [H_{2-j}A^{-j}], yields:
0  = x – y_{1} – 2 y_{2} – K_{w}/x  
= x – K_{w}/x – (y_{1} + 2 y_{2})  
= x – K_{w}/x – C_{T} (a_{1} + 2 a_{2}) 
This provides the exact relationship between the total amount of acid C_{T} and the pH value (= –lg x):
(8)  \(C_T(x) \ =\ \dfrac{x - K_w/x}{a_1 + 2a_2} \ =\ \left(x - \dfrac{K_w}{x}\right) \ \dfrac{K_2/x + 1 + x/K_1} {1 + 2K_2/x}\)
In fact, this one-liner encapsulates the entire information contained in the set of five nonlinear equations, i.e. Eqs. (1a) to (1e).
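A short numerical sketch of Eq. (8), assuming K_{w} = 10^{-14} (25 °C) and the carbonic-acid pK values used earlier; the function name `total_acid` is hypothetical. Activities are again taken equal to concentrations.

```python
KW = 1.0e-14  # assumed ion product of water (25 degrees C)


def total_acid(pH, pK1=6.35, pK2=10.33, Kw=KW):
    """C_T in mol/L from Eq. (8); physically meaningful for pH < 7."""
    x = 10.0 ** (-pH)
    K1 = 10.0 ** (-pK1)
    K2 = 10.0 ** (-pK2)
    # (x - Kw/x) times the reciprocal of (a1 + 2*a2), written out as in Eq. (8)
    return (x - Kw / x) * (K2 / x + 1.0 + x / K1) / (1.0 + 2.0 * K2 / x)


# The more acid is dissolved, the lower the pH; C_T vanishes at pH 7.
for pH in (4.0, 5.0, 6.0, 7.0):
    print(f"pH {pH:.1f}: C_T = {total_acid(pH):.3e} mol/L")
```

For pH above 7 the formula returns negative values, which simply signals that such a pH cannot be reached by adding the pure acid H_{2}A to water.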
Based on Eq. (8) we are in a position to replace the approximate formula in Eq. (7) by an exact formula valid for all three acid species:
(9)  [H_{2-j}A^{-j}] = \(\left( \dfrac{x - K_w/x}{a_1 + 2a_2} \right) \ a_j\)  for j = 0, 1, 2
[Example: The equations above were applied for the description of the closed and the open CO_{2} system.]
Inverse Task. Given the pH (or x), Eq. (8) calculates C_{T}. The inverse task, calculating the pH (or x) for a given C_{T}, is more intricate, because an explicit function such as pH = f(C_{T}) does not exist. What we can offer is an implicit function in the form of a polynomial of degree 4 in x, i.e. a quartic equation:
(10)  x^{4} + K_{1} x^{3} + (K_{1}K_{2} – C_{T }K_{1} – K_{w}) x^{2} – K_{1} (2C_{T }K_{2} + K_{w}) x – K_{1}K_{2}K_{w} = 0 
To recap: There is Eq. (8), there is Eq. (10), and there is the set of five equations defined in (1). All three are equivalent; they represent one and the same thing: the complete mathematical description of a diprotic acid. Clearly, calculating C_{T} for a given x (or pH) from Eq. (8) is much easier than solving a 4^{th}-order equation to get x (or pH) for a given value of C_{T}.
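One way to handle the inverse task numerically, sketched here under stated assumptions, is to skip the quartic and invert Eq. (8) by bisection: on the acidic side (0 ≤ pH ≤ 7) C_{T} decreases monotonically with pH, so there is exactly one root. The function names are hypothetical, K_{w} = 10^{-14} is assumed, and bisection is a swapped-in technique, not the method used by any particular program. The helper `quartic` evaluates the left-hand side of Eq. (10) so that the result can be cross-checked.

```python
def total_acid(x, K1, K2, Kw):
    """Eq. (8): C_T as a function of x = [H+]."""
    return (x - Kw / x) * (K2 / x + 1.0 + x / K1) / (1.0 + 2.0 * K2 / x)


def quartic(x, CT, K1, K2, Kw):
    """Left-hand side of Eq. (10); zero at the equilibrium value of x."""
    return (x**4 + K1 * x**3 + (K1 * K2 - CT * K1 - Kw) * x**2
            - K1 * (2.0 * CT * K2 + Kw) * x - K1 * K2 * Kw)


def ph_from_total_acid(CT, pK1=6.35, pK2=10.33, Kw=1.0e-14, tol=1e-12):
    """Solve Eq. (8) for pH by bisection on the interval 0 <= pH <= 7."""
    K1, K2 = 10.0 ** (-pK1), 10.0 ** (-pK2)
    lo, hi = 0.0, 7.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if total_acid(10.0 ** (-mid), K1, K2, Kw) > CT:
            lo = mid   # Eq. (8) demands more acid here, so the root lies at higher pH
        else:
            hi = mid
    return 0.5 * (lo + hi)


pH = ph_from_total_acid(1.0e-3)   # 1 mM of a carbonic-acid-like diprotic acid
print(f"pH = {pH:.3f}")
```

The x obtained this way should also satisfy Eq. (10) to within rounding error, which is a convenient consistency check between the two formulations.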
Diprotic Acids including Ampholytes and Conjugate Bases
Any diprotic acid is closely tied to its conjugate base(s), H_{2}A ⇔ BHA ⇔ B_{2}A, where B refers to the cation of a monoacidic base (B^{+} = Na^{+}, K^{+}, or NH_{4}^{+}). For example, H_{2}CO_{3}, NaHCO_{3}, and Na_{2}CO_{3} represent such an acid-ampholyte-base triple.
Let’s denote the stoichiometric coefficient of B^{+} by n; we then get the compact notation:
(11)  B_{n}H_{2-n}A  (or H_{2-n}A^{-n})  with  n = 0  for acid  (H_{2}A)
      n = 1  for ampholyte  (BHA)
      n = 2  for base  (B_{2}A)
The set of equations to describe this system is
(12a)  K_{1}  = {H^{+}} {HA^{-}} / {H_{2}A}  (1^{st} diss. step)
(12b)  K_{2}  = {H^{+}} {A^{2-}} / {HA^{-}}  (2^{nd} diss. step)
(12c)  K_{w}  = {H^{+}} {OH^{-}}  (self-ionization)
(12d)  C_{T}  = [H_{2}A] + [HA^{-}] + [A^{2-}]  (mass balance)
(12e)  0  = [H^{+}] + n [H_{2}A] + (n–1) [HA^{-}] + (n–2) [A^{2-}] – [OH^{-}]  (proton balance)
It differs from the set of equations (1) only by a single equation, namely the last line, where “charge balance” is replaced by the more general concept of proton balance.^{4}
Remarkably, the last equation (12e) is the only one that explicitly depends on n. The other four equations are independent of the type of reactant we add to water (acid, ampholyte, or base). In particular, the ionization fractions derived in Eqs. (5a) to (5c) for the diprotic acid, H_{2}A, are independent of n; they remain the same in our extended approach.
The set of equations (12) represents the core for the math description of buffer systems.
Exact Relationship between pH and C_{T}
The entire set of equations (12a) to (12e) can be condensed into a single formula, much as was done in Eq. (8) above:
(13)  C_{T}(n,x) = \(\dfrac{x - K_w/x}{a_1 + 2a_2 - n} \, =\, \left(x - \dfrac{K_w}{x}\right)\, \left(\dfrac{1+2K_2/x} {x/K_1 + 1 + K_2/x} - n\right)^{-1}\)
For n = 0, it falls back to Eq. (8). Based on Eq. (13) we get, in place of Eq. (9), the generalized formula for the three acid species:
(14)  [H_{2-j}A^{-j}] = \(\left( \dfrac{x - K_w/x}{a_1 + 2a_2 - n} \right)\ a_j\)  for j = 0, 1, 2
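The step from (12e) to Eq. (13) is not spelled out above; under the same concentration-based assumptions it can be reconstructed as follows. Inserting [OH^{-}] = K_{w}/x and Eq. (7) into (12e), and then using a_{0} + a_{1} + a_{2} = 1 from Eq. (6), gives

\(0 \ =\ x - \dfrac{K_w}{x} + C_T\,\bigl[\,n\,a_0 + (n-1)\,a_1 + (n-2)\,a_2\,\bigr] \ =\ x - \dfrac{K_w}{x} + C_T\,\bigl[\,n - (a_1 + 2a_2)\,\bigr]\)

Solving this for C_{T} reproduces Eq. (13).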
Inverse Task. The conversion of C_{T}(n,x) into its inverse form x(n,C_{T}) leads again to a polynomial of degree 4 in x (quartic equation):
(15)  x^{4} + {K_{1} + nC_{T}} x^{3} + {K_{1}K_{2} + (n–1)C_{T }K_{1} – K_{w}} x^{2} + K_{1} {(n–2)C_{T }K_{2} – K_{w}} x – K_{1}K_{2}K_{w} = 0 
Each formula, whether Eq. (13) or Eq. (15), combines three equations in compact form: one for an acid (n = 0), already presented in Eqs. (8) and (10); one for an ampholyte (n = 1); and one for a base (n = 2).
Plots. The diagram below displays C_{T} as a function of pH. The solid lines represent Eq. (13) for n = 0, 1, and 2. The dots are exact results calculated with aqion (or PhreeqC), where activity corrections are considered. [Note: Activity corrections are especially relevant for high concentrations, i.e. high ionic strengths.]
The prefactor of Eq. (13), x – K_{w}/x, becomes zero at pH = 7.0, i.e. at x = 10^{-7}. This must be the case, because C_{T} = 0 means “pure water” (where all the curves come together).
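The solid lines described above can be reproduced with a few lines of code. The following is a sketch of Eq. (13) only (no activity corrections), with a hypothetical function name `total_conc`, an assumed K_{w} = 10^{-14}, and the carbonic-acid pK values as defaults.

```python
def total_conc(n, pH, pK1=6.35, pK2=10.33, Kw=1.0e-14):
    """Eq. (13): C_T for an acid (n=0), ampholyte (n=1) or base (n=2)."""
    x = 10.0 ** (-pH)
    K1, K2 = 10.0 ** (-pK1), 10.0 ** (-pK2)
    a1 = 1.0 / (x / K1 + 1.0 + K2 / x)   # Eq. (5b)
    a2 = (K2 / x) * a1                   # Eq. (5c), rewritten via a1
    return (x - Kw / x) / (a1 + 2.0 * a2 - n)


# One sample point on each of the three curves.
for n, pH in ((0, 5.0), (1, 8.0), (2, 10.0)):
    print(f"n={n}, pH={pH:4.1f}: C_T = {total_conc(n, pH):.3e} mol/L")
```

A negative return value means that the chosen pH cannot be reached with that reactant. With these pK values the acid covers pH < 7, the ampholyte roughly 7 < pH < 8.34 (the upper limit being (pK_{1} + pK_{2})/2, where the denominator for n = 1 changes sign), and the base pH > 7.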
Proton Balance Equation (Proton Condition)
The proton balance was used in Eq. (12e). It is a balance between the species that have excess protons and those that are deficient in protons, relative to a defined proton reference level (PRL).
Example 1. The simplest case is pure water with its three species H^{+}, OH^{-}, and H_{2}O. Choosing H_{2}O as the reference level, the species H^{+} (or H_{3}O^{+}) is enriched by 1 proton (excess proton), while OH^{-} is depleted by 1 proton (deficient proton). The proton balance equation becomes:^{5}
PRL  excess protons  =  deficient protons

(16)  H_{2}O  [H^{+}]  =  [OH^{-}]
Because water is ever-present in an acid-base system, H^{+} and OH^{-} always enter the proton balance, one on the left-hand and the other on the right-hand side of the equation.
Example 2. The carbonic acid system has three distinct reference levels:^{6}
PRL  excess protons  =  deficient protons

(17a)  H_{2}CO_{3}  [H^{+}]  =  [HCO_{3}^{-}] + 2 [CO_{3}^{2-}] + [OH^{-}]
(17b)  HCO_{3}^{-}  [H^{+}] + [H_{2}CO_{3}]  =  [CO_{3}^{2-}] + [OH^{-}]
(17c)  CO_{3}^{2-}  [H^{+}] + 2 [H_{2}CO_{3}] + [HCO_{3}^{-}]  =  [OH^{-}]
How do you obtain these equations?
First, the two species H^{+} and OH^{-} that appear in each equation trace back to the H_{2}O reference level in Eq. (16).^{7} They have their permanent place on opposite sides in any proton balance. Thus, all we have to do is to add the carbonic-acid species (H_{2}CO_{3}, HCO_{3}^{-}, CO_{3}^{2-}) to the correct side of the equation.
In (17a), H_{2}CO_{3} is the reference level. There are no carbonate species that have more protons than H_{2}CO_{3}; hence, there is nothing to add to the left-hand side. Conversely, HCO_{3}^{-} is deficient by 1 proton and CO_{3}^{2-} by 2 protons; therefore, both species enter the right-hand side.^{8}
In (17b), HCO_{3}^{-} is the reference level. From this perspective, H_{2}CO_{3} has 1 excess proton (species enters the left-hand side), while CO_{3}^{2-} is deficient by 1 proton (species enters the right-hand side).
In (17c), CO_{3}^{2-} is the reference level. Now, H_{2}CO_{3} has 2 excess protons and HCO_{3}^{-} has 1 excess proton (both species enter the left-hand side); but there are no species that have fewer protons than CO_{3}^{2-} (i.e. no carbonate species enters the right-hand side).
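The bookkeeping described in these three cases is mechanical enough to be sketched in code. In the following illustration the names `CARBONATE` and `proton_balance` and the plain-text species labels are hypothetical; each species is assigned its number of dissociable protons, and the difference to the reference level decides on which side (and with which factor) it enters.

```python
# Number of dissociable protons carried by each carbonate species.
CARBONATE = {"H2CO3": 2, "HCO3-": 1, "CO3^2-": 0}


def proton_balance(reference, species=CARBONATE):
    """Build the proton balance (as text) for a given proton reference level."""
    ref = species[reference]
    left = ["[H+]"]    # H+ always sits on the excess-proton side, Eq. (16)
    right = []
    for name, protons in species.items():
        diff = protons - ref          # > 0: excess protons, < 0: deficient
        if diff == 0:
            continue                  # the reference species itself drops out
        factor = abs(diff)
        term = f"[{name}]" if factor == 1 else f"{factor} [{name}]"
        (left if diff > 0 else right).append(term)
    right.append("[OH-]")             # OH- always sits on the deficient side
    return " + ".join(left) + " = " + " + ".join(right)


for prl in CARBONATE:
    print(f"PRL {prl:7s}: {proton_balance(prl)}")
```

The three printed lines correspond to Eqs. (17a), (17b), and (17c).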
General Case. Given the proton-reference level H_{2-n}A^{-n}, the proton balance equation becomes (for n = 0, 1, 2):
PRL  0  =  excess protons – deficient protons

(18)  H_{2-n}A^{-n}  0  =  [H^{+}] + n [H_{2}A] + (n–1) [HA^{-}] + (n–2) [A^{2-}] – [OH^{-}]
This one-liner comprises all three equations of Example 2. Equation (18) was adopted in Eq. (12e) above.
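As a numerical sanity check, one can confirm that a solution composition computed from Eqs. (13) and (14) indeed satisfies Eq. (18). The sketch below uses a hypothetical function name, assumes K_{w} = 10^{-14} and the carbonic-acid pK values, and neglects activity corrections.

```python
def proton_balance_residual(n, pH, pK1=6.35, pK2=10.33, Kw=1.0e-14):
    """Right-hand side of Eq. (18), evaluated with C_T taken from Eq. (13)."""
    x = 10.0 ** (-pH)
    K1, K2 = 10.0 ** (-pK1), 10.0 ** (-pK2)
    a0 = 1.0 / (1.0 + K1 / x + K1 * K2 / x**2)   # Eq. (5a)
    a1 = (K1 / x) * a0                            # Eq. (5b)
    a2 = (K1 * K2 / x**2) * a0                    # Eq. (5c)
    CT = (x - Kw / x) / (a1 + 2.0 * a2 - n)       # Eq. (13)
    h2a, ha, a = CT * a0, CT * a1, CT * a2        # Eq. (14)
    oh = Kw / x                                   # from Eq. (12c)
    return x + n * h2a + (n - 1) * ha + (n - 2) * a - oh


# One test point per reactant type; each residual should vanish up to rounding.
for n, pH in ((0, 5.0), (1, 8.0), (2, 10.0)):
    print(f"n={n}, pH={pH:4.1f}: residual = {proton_balance_residual(n, pH):.2e}")
```

The residuals are many orders of magnitude smaller than the individual concentrations, as expected if Eq. (13) is an exact consequence of the set (12a) to (12e).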
The proton reference level (PRL) is closely related to the concept of alkalinity and equivalence points (often both terms are used as synonyms).
Remarks & Footnotes

^{1} An alternative description, based on the tableaux method, is presented as PowerPoint. (Perhaps the best introduction to the tableaux method is given in the classic textbook by F.M.M. Morel and J.G. Hering: Principles and Applications of Aquatic Chemistry, John Wiley, 1993.)

^{2} For a rigorous mathematical description of N-protic acids we refer to the review (2021) or lecture (2023).

^{3} Except for H^{+}. Replacing [H^{+}] by {H^{+}} is not necessary, because the pH value is related to the activity of H^{+} (not its concentration).

^{4} It is not necessary to build the theory upon the proton balance; instead of the proton balance one can also use an (extended) charge balance, like here or here.

^{5} Square brackets denote molar concentrations.

^{6} In hydrochemistry, the composite carbonic acid H_{2}CO_{3}^{*} is used instead of H_{2}CO_{3}. [In the program, H_{2}CO_{3}^{*} is abbreviated by CO_{2}, because almost all of H_{2}CO_{3}^{*} is just dissolved CO_{2}.]

^{7} The reference level “H_{2}O” is not listed separately in the table’s PRL column. But keep in mind that it is always present (in addition to H_{2}CO_{3}, HCO_{3}^{-}, or CO_{3}^{2-}).

^{8} If a species has lost 2 protons relative to the PRL, its concentration is multiplied by 2.