Taking Assumptions with Peruvian Presidential Voting Data with Azure Synapse Serverless

Last week I published a post that allows users use Peruvian voting data to Azure Synapse Serverless.

This post will allow us take assumptions to visualize votes on the 1st round that aligned to either left or right, and compare them to the 2nd round, where Pedro Castillo (PerĂº Libre) represents the left, and Keiko Fujimori (Fuerza Popular) represents the right.

These is how we are assuming Left and Right from the first round:

VOTOS_P1PARTIDO NACIONALISTA PERUANOIzquierda
VOTOS_P2EL FRENTE AMPLIO POR JUSTICIA, VIDA Y LIBERTADIzquierda
VOTOS_P3PARTIDO MORADOIzquierda
VOTOS_P4PERU PATRIA SEGURAIzquierda
VOTOS_P5VICTORIA NACIONALDerecha
VOTOS_P6ACCION POPULARDerecha
VOTOS_P7AVANZA PAIS – PARTIDO DE INTEGRACION SOCIALIzquierda
VOTOS_P8PODEMOS PERUIzquierda
VOTOS_P9JUNTOS POR EL PERUIzquierda
VOTOS_P10PARTIDO POPULAR CRISTIANO – PPCDerecha
VOTOS_P11FUERZA POPULARDerecha
VOTOS_P12UNION POR EL PERUIzquierda
VOTOS_P13RENOVACION POPULARDerecha
VOTOS_P14RENACIMIENTO UNIDO NACIONALIzquierda
VOTOS_P15PARTIDO DEMOCRATICO SOMOS PERUDerecha
VOTOS_P16PARTIDO POLITICO NACIONAL PERU LIBREIzquierda
VOTOS_P17DEMOCRACIA DIRECTADerecha
VOTOS_P18ALIANZA PARA EL PROGRESODerecha

Combining Left and Right votes from first round

The view I have created last time will be extended to combine what have assumed are votes for left and right, to be compared to votes of second round.

CREATE VIEW ONPE_ASSUMPTION AS 
SELECT
    DEPARTAMENTO,
    PROVINCIA,
    DISTRITO,
    MESA_DE_VOTACION,
    DESCRIP_ESTADO_ACTA_1ra,
    DESCRIP_ESTADO_ACTA_2da,
    N_CVAS_1ra,
    N_CVAS_2da,
    VOTOS_P1,
    VOTOS_P2,
    VOTOS_P3,
    VOTOS_P4,
    VOTOS_P5,
    VOTOS_P6,
    VOTOS_P7,
    VOTOS_P8,
    VOTOS_P9,
    VOTOS_P10,
    VOTOS_P12,
    VOTOS_P13,
    VOTOS_P14,
    VOTOS_P15,
    VOTOS_P17,
    VOTOS_P18,
    FP_1ra,
    PL_1ra,
    VOTOS_VB_1ra,
    VOTOS_VN_1ra,
    FP_2da,
    PL_2da,
    VOTOS_VB_2da,
    VOTOS_VN_2da,
    VOTOS_P1 + VOTOS_P2 + VOTOS_P3 + VOTOS_P4 + VOTOS_P7 + VOTOS_P8 + VOTOS_P9 + VOTOS_P12 + VOTOS_P14 + PL_1ra AS VOTOS_IZQ_1ra,
    VOTOS_P5 + VOTOS_P6 + VOTOS_P10 + FP_1ra + VOTOS_P13 + VOTOS_P15 + VOTOS_P17 + VOTOS_P18 AS VOTOS_DER_1ra
FROM ONPE;

Let’s take a look at the data:

SELECT TOP 100
    DEPARTAMENTO,
    PROVINCIA,
    DISTRITO,
    MESA_DE_VOTACION,
    FP_2da,
    PL_2da, 
    VOTOS_DER_1ra,
    VOTOS_IZQ_1ra
FROM ONPE_ASSUMPTION;

Query combined votes and compare with 2nd round

On this example, we are going to extend from the query used on the prior post, to add assumptions of left and right combined votes, using Common Table Expressions.

FULL OUTER JOIN is needed, as not all scenarios might be present for # of votes for each table.

WITH CTE_FP AS (
    SELECT
        FP_2da AS NUMERO_DE_VOTOS,
        COUNT(MESA_DE_VOTACION) AS CANTIDAD_DEMESAS
    FROM ONPE
    WHERE DESCRIP_ESTADO_ACTA_2da = 'CONTABILIZADA'
    GROUP BY FP_2da
), CTE_PL AS (
    SELECT
        PL_2da AS NUMERO_DE_VOTOS,
        COUNT(MESA_DE_VOTACION) AS CANTIDAD_DEMESAS
    FROM ONPE
    WHERE DESCRIP_ESTADO_ACTA_2da = 'CONTABILIZADA'
    GROUP BY PL_2da
), CTE_DER_1ra AS (
    SELECT
        VOTOS_DER_1ra AS NUMERO_DE_VOTOS,
        COUNT(MESA_DE_VOTACION) AS CANTIDAD_DEMESAS
    FROM ONPE_ASSUMPTION
    WHERE DESCRIP_ESTADO_ACTA_1ra = 'CONTABILIZADA'
    GROUP BY VOTOS_DER_1ra
), CTE_IZQ_1ra AS (
    SELECT
        VOTOS_IZQ_1ra AS NUMERO_DE_VOTOS,
        COUNT(MESA_DE_VOTACION) AS CANTIDAD_DEMESAS
    FROM ONPE_ASSUMPTION
    WHERE DESCRIP_ESTADO_ACTA_1ra = 'CONTABILIZADA'
    GROUP BY VOTOS_IZQ_1ra
)
SELECT 
    COALESCE(FP.NUMERO_DE_VOTOS, PL.NUMERO_DE_VOTOS) AS NUMERO_DE_VOTOS,
    COALESCE(FP.CANTIDAD_DEMESAS, 0) AS CANTIDAD_DEMESAS_FP,
    COALESCE(PL.CANTIDAD_DEMESAS, 0) AS CANTIDAD_DEMESAS_PL,
    COALESCE(D.CANTIDAD_DEMESAS, 0) AS CANTIDAD_DEMESAS_D,
    COALESCE(I.CANTIDAD_DEMESAS, 0) AS CANTIDAD_DEMESAS_I
FROM CTE_FP AS FP
    FULL OUTER JOIN CTE_PL PL ON (FP.NUMERO_DE_VOTOS = PL.NUMERO_DE_VOTOS)
    FULL OUTER JOIN CTE_DER_1ra D ON (FP.NUMERO_DE_VOTOS = D.NUMERO_DE_VOTOS)
    FULL OUTER JOIN CTE_IZQ_1ra I ON (PL.NUMERO_DE_VOTOS = I.NUMERO_DE_VOTOS)
ORDER BY COALESCE(FP.NUMERO_DE_VOTOS, PL.NUMERO_DE_VOTOS);

And the results like below:

Create Charts

If you switch to chart, then select “Area” under Chart Type, and exclude NUMERO_DE_VOTOS, and just keep CANTIDAD_DE_MESAS_FP, AND CANTIDAD_DE_MESAS_D, you visualize:

CANTIDAD_DE_MESAS_D implies all votes that were for Right candidates on 1st round. There is certainly not a bell shape on either case.

If we swap the options to keep CANTIDAD_DE_MESAS_PL, AND CANTIDAD_DE_MESAS_I, you visualize:

CANTIDAD_DE_MESAS_I implies all votes that were for Left candidates on 1st round. We could argue that they resemble a bell shape.

The learnings

Key aspect is that we can continue creating additional views from the original that feeds from data residing in the Data Lake Store, and leverage the data to visualize based on assumptions.