Survey: US-Based Mexican Average Salary Between 47k and 67k Annually

From May 2nd to May 5th of 2020, I gathered data with an online survey administered with the help of Mexican based Data Analysts who helped me recruit participants and review the data. My goal was to understand how COVID-19 affected my community’s economic status and employment prospects.

The descriptions here apply to 77 confirmed Mexicans, 48 of whom were verified through known connections and ‘friends-of-friends’ in social media. The overall sample was larger than 160 individuals surveyed online. All were Spanish speaking and were thus classified as ‘Latino’ at a minimum if they did not opt into the ‘Mexican’ category.

Survey Goals

We wanted to know the average salary of Mexicans based in the United States and how Mexicans fared during the peak period of uncertainty in the pandemic. We consider that period to run from early March 2020 to June 2020, when no vaccine was available and the most stringent lockdowns took place.

Ultimately, we suspect that the salary prospects of Mexicans during the COVID-19 lockdown were mitigated by their participation in the construction and education sectors of the economy.

Initial Motivations

We wanted to maximize privacy, but understand the moment faced by the Mexican community. During this time period, remittances to Mexico broke records. Unfortunately, many erroneously attributed Mexican resilience to a meager and occasional stimulus check. Those Mexicans with deeper ties to Mexico tended to not have access to the stimulus check; their legal status as economic migrants would not support the idea.

There was also an additional need to home in on Mexicans specifically, as most research tends to homogenize distinct communities under ‘Latino’. In this research, the outcomes for the entire population of Spanish speaking individuals with ties outside of the US are worse than what is presented in this report.

Ultimately, the data found within this survey is in harmony with the fact that Mexicans somehow found more work or income to send as remittances to Mexico during the multiple lockdowns.

Survey Administered In Spanish

To confirm we were indeed surveying self-identified Mexican individuals, we administered the survey in Spanish; we also exclusively distributed this survey within closed networks of the Mexican community. However, we also went through a paid-tier of Survey Monkey to examine whether those results remained consistent or affected the overall trajectory of the average salary range for Mexicans in the US. The act of including Survey Monkey participants raised the minimum average salary range too.

Due to mandatory social distancing, the usage of online survey methods was the only acceptable way to take a survey data-based snapshot of our community.

Survey Results

The survey results indicated that about half of Mexicans were not impacted in their employment or hours worked during the Coronavirus due to their links to education, clerical, health and construction sectors.

The four main sectors: Health, Education, Clerical and Construction – mostly WFH or essential sectors.

From the ‘most-trusted’ group of 48 survey participants who were verified through social media and prior in-person interactions, we determined that the average salary range for Mexicans was 47,347 to 67,243 US dollars. Including the Survey Monkey group raises the lower bound to 48,000, while the upper bound stays flat.

Students, Restaurant Workers

If we remove individuals who are underemployed due to their status as students or restaurant workers (many of whom received some moderate amount of unemployment or switched sectors), then the minimum average salary is raised to over 56 thousand dollars.

Thus, we could surmise that the average Mexican is making over 56 thousand dollars if active and fully employed in the labor market.

To be more exact, here is the average salary range for the fully employed Mexicans surveyed:

Lower bound: 57812.5
Upper bound: 67243.94
Fully Employed averages.

48 percent of the Mexicans surveyed make over 50000 USD annually:

Survey participants were asked which salary range their annual income fell under. Nearly half of Mexicans surveyed selected more than 50,000 annually.

When averaged out, the average Mexican salary range, for the participants surveyed both in SM and via closed networks, runs from 48571 to 67243 US dollars annually:

Lower bound: 48571.43
Upper bound: 67243.94
Lower and Upper Bound for average salary range for Mexican individuals in United States.
Bottom 9 percent were students, restaurant employees impacted by Covid-19 Lockdowns
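
The report does not spell out how the lower and upper averages were computed, so the following is only a sketch of one plausible approach: each respondent selects a salary bracket, and the lower and upper bounds of the selected brackets are averaged separately. The bracket labels, bracket bounds and responses are invented for illustration and are not the survey data.

```python
from statistics import mean

# Hypothetical brackets (lower, upper) in USD; NOT the survey's actual options.
BRACKETS = {
    "a": (20_000, 35_000),
    "b": (35_000, 50_000),
    "c": (50_000, 75_000),
    "d": (75_000, 100_000),
}


def average_salary_range(responses):
    """Average the lower and the upper bracket bounds separately."""
    lowers = [BRACKETS[label][0] for label in responses]
    uppers = [BRACKETS[label][1] for label in responses]
    return mean(lowers), mean(uppers)


# Made-up responses, one bracket label per respondent.
responses = ["a", "b", "c", "c", "d"]
low, high = average_salary_range(responses)
print(f"Average salary range: {low:,.2f} to {high:,.2f} USD")
```

Dropping a subgroup, such as students, would amount to filtering the list of responses before calling the function.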

If we include Survey Monkey based participants, Mexican minimum salaries rise to 51000 dollars annually.

Lockdown Impacts

The Coronavirus impacted about half of the Mexicans surveyed, mostly women, when it came to the number of hours worked. Those who were impacted tended to have major losses in hours worked. The rest lost between 1 and just under 20 hours of work.

Women made up 72 to 63 percent of Mexican respondents who were impacted adversely by Covid-19

Our numbers vary due to the exclusion or inclusion of Survey Monkey based respondents. If we went outside of our vetted participants, we received noisy data from individuals who cannot be confirmed as Mexican or who may not be operating in good faith. Nonetheless, income averages crept higher and unemployment rates fell when including these unvetted respondents.

Demographics of Apple Executives

Most Decision Makers Are White

EEOD DATA Indicates 99 out of 123 Execs Are White.

Graphic of @Apple executives based on most recent EEOD data

Recently, the National Labor Relations Board sued Apple. According to the suit, Apple has fired employees in a retaliatory fashion. The employee in question had alleged Sexual Harassment by a male colleague. Unfortunately, Apple management notified the accused party, shifted the victim towards a different line of work within the company and eventually fired the employee after having placed her on administrative leave.

With this recent news in mind, we wondered about the basic demographics of Apple executives. Generally, these are white people in control of major day to day decisions regarding personnel.

A lot of dollars are going to that elite cadre, perhaps changing and influencing generations of professionals within the IT industry as well as creating generational wealth for those within their demographic:

80 Percent Of Apple Executives Are White

An environment with plenty of checks and measures may be necessary so that a single demographic is not overly represented. Perhaps HR could recruit outside of Silicon Valley, or the company could establish a presence in urban centers with a high concentration of demographics lacking in its ranks. Accidents happen; perhaps engineers just recruit or refer people they know, and this leads to a legacy of ‘friends of friends’ joining the company.

For now, the majority of Apple Execs are white: 80 percent. The rest are Asian, with 2 Hispanic individuals and 1 lone Black person.

Apple is composed of a network of likeminded individuals. Unsurprisingly, their leadership style and composition reflect this closed world.

Tokenizing Text, Finding Word Frequencies Within Corpora

One way to think about tokenization is to consider it as finding the smallest possible unit of analysis for computational linguistics tasks. As such, we can think of tokenization as among the first steps (along with normalization) in the average NLP pipeline or computational linguistics analysis. This process helps break down text into a manner interpretable for the computer and analyst.
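
To make the two steps concrete, here is a minimal sketch that uses only Python's standard library. The function names `normalize` and `tokenize` are my own, and the regular expression is a deliberately simple assumption: it treats any run of letters, digits or underscores as a token, so a word with an apostrophe would be split in two.

```python
import re


def normalize(text):
    """Lowercase the text and collapse runs of whitespace into one space."""
    return re.sub(r"\s+", " ", text.lower()).strip()


def tokenize(text):
    """Return the word tokens of the normalized text, dropping punctuation."""
    return re.findall(r"\w+", normalize(text))


tokens = tokenize("Tokenization is among the FIRST steps, along with normalization.")
print(tokens)
# ['tokenization', 'is', 'among', 'the', 'first', 'steps', 'along', 'with', 'normalization']
```

Because `\w` is Unicode-aware in Python 3, accented Spanish characters such as ‘ñ’ stay inside their tokens.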

NLTK Is Very Comprehensive

NLTK is likely the best place to start for both understanding and customizing NLP pipelines. Please review their documentation on tokenization here: NLTK – Tokenization Example. While I recommend reviewing NLTK, you should also keep up with engineers who mostly use TensorFlow. Yes, you must learn two packages at once if you are a linguist in the IT industry.

Learn TensorFlow

“Learn TensorFlow. Given the effort you will place into learning how to combine computing with linguistics, you are also in a strange way lightening the load by proceeding in parallel with industry trends. Likely, a popular topic will contain ample documentation. Consider that most engineers will have a frame of reference for tokenization that is not necessarily grounded in Linguistics, but instead based on interactions with industry centric examples with an intent to prepare data for Machine Learning.”

Industry Perceptions

Thus, if you do not know tokenization both in terms of how engineers perceive tokenization and how linguists work with the concept, then you will likely be perceived as not only not knowing how to program, but also not knowing about your own subject matter as a linguist. While this is obviously not true, perception matters so you must make the effort to reach engineers at their level when collaborating.

# -*- coding: utf-8 -*-
"""
Created on Fri Sep 10 23:53:10 2021

@author: Ricardo Lezama
"""
import tensorflow as tf 

text = """ 
    A list comprehension is a syntactic construct available in
    some programming languages for creating a list based on existing lists. 
    It follows the form of the mathematical set-builder notation (set comprehension) as
    distinct from the use of map and filter functions.
    """
 
content = tf.keras.preprocessing.text.text_to_word_sequence(text)

Obviously, we do not want to repeat the one-liner above over and over again within our individual Python scripts. Thus, we neatly repackage it as the main line in a function titled ‘tokenize_lacartita’ as follows:



def tokenize_lacartita(text):
    """ Tokenize a text string with the Keras text_to_word_sequence helper.
    
    Arg: This should be a text string that will be converted into a list with individual tokens.
    
    Returns: A list of lowercased tokens with punctuation removed.
          ex.) ['this', 'is', 'text', 'tokenized', 'by', 'tensorflow']
    """
    keras_tok  = tf.keras.preprocessing.text.text_to_word_sequence(text)
    return keras_tok

The output from this tokenization function is shown below. As you can see, it is a list of individual strings, lowercased and without punctuation, which is removed by default in the tokenization process.

['morelia',
 'apoyar',
 'no',
 'es',
 'delinquir',
 'señalan',
 'grupos',
 'feministas',
 'a',
 'sheinbaum',
 'capacitan',
 'a',
 'personal',
 'de',
 'la',
 'fiscalía',
 'cdmx']

Word Frequency and Relative Percentage

We can create a function to find word frequencies. Granted, the Counter class in Python’s collections module can do this already, but, for educational purposes, we include a function to track a word’s frequency within a list. The if-condition below lets us count each time we see our target word within the word list. In this case, we examine a series of headlines related to Mexico that were gathered and classified by hand by Mexican University students.

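def word_frequency(target_word, word_list):
    """
    Function that counts word frequencies.

    Args: target_word is the word to count; word_list is the list of tokens to search.
    
    Return: a string pairing the target word with its count, e.g. 'amlo: 12'.
    """
    count = 0
    # Compare every token against the target word and tally the matches.
    for word in word_list:
        if word == target_word:
            count += 1
    return f"{target_word}: {count}"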

The word_frequency function receives “AMLO” or its normalized version, ‘amlo’, as the first argument and the word list as the second argument. The frequency of the string is listed next to the term when it is returned. Obviously, you can add more elaborate details to the body of the function.

word_frequency("amlo", saca)
Out[164]: 'amlo: 12'
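
For comparison, here is a minimal sketch of the same count done with `collections.Counter`, extended with the relative percentage of a word within the list. The token list is a made-up stand-in for the tokenized headlines, and `relative_percentage` is a hypothetical helper name.

```python
from collections import Counter

# Hypothetical tokens standing in for the tokenized headlines.
tokens = ["amlo", "visita", "morelia", "amlo", "anuncia", "apoyo", "amlo", "morelia"]

counts = Counter(tokens)       # maps each token to its frequency
total = sum(counts.values())   # total number of tokens in the list


def relative_percentage(word):
    """Share of all tokens, in percent, taken up by `word`."""
    return 100 * counts[word] / total


print(counts["amlo"])                          # 3
print(counts.most_common(2))                   # [('amlo', 3), ('morelia', 2)]
print(f"{relative_percentage('amlo'):.1f}%")   # 37.5%
```

A word that never appears simply gets a count of 0, so no special handling is needed for missing terms.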

Tokenization In Native Python

At times, an individual contributor must know how to tokenize without writing a custom function or invoking tokenization from a complex module with heavy libraries. There may be times that linguists are working within siloed environments. This implies that you would not have write privileges to install libraries, like TensorFlow, in a generic Linux environment. In these cases, use native Python: the term refers to built-in functions or modules that require no additional installation beyond Python itself.

In fact, you may indeed need to rely more on the raw text you are attempting to tokenize. At times, there are different orthographic marks that are relevant and necessary to find. For example, were you to split based on a space, “ ”, or a period, “.”, you can do so by calling the split method.

def word(text): 
    return text.split(" ")
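
As a small sketch of splitting on a period rather than a space, the example below breaks a made-up string into rough sentences and then into words. The sample text is invented, and real sentence boundaries (abbreviations, decimals) would need more care than this.

```python
text = "Morelia apoya. Grupos feministas señalan. Capacitan a personal."

# Split on the period, trim each piece and drop the empty piece after the final period.
sentences = [part.strip() for part in text.split(".") if part.strip()]
print(sentences)
# ['Morelia apoya', 'Grupos feministas señalan', 'Capacitan a personal']

# Each sentence can then be split on a space to get its words.
words = [sentence.split(" ") for sentence in sentences]
print(words[0])
# ['Morelia', 'apoya']
```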

All strings have a split method that you can call without installing anything. Furthermore, you can call a method named ‘strip’ to clean out leading and trailing whitespace. Please see the example below.

def sentence(text): 
    text_strip = text.strip()
    return text_strip.split(" ")
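
One detail worth seeing in action: `strip` only trims the ends of a string, so splitting on a single space still produces empty strings wherever several spaces occur in a row. The sketch below, on a made-up string, contrasts that with calling `split` with no argument.

```python
raw = "  capacitan   a personal \n"

# strip() trims the ends; the run of three internal spaces still yields empty strings.
print(raw.strip().split(" "))
# ['capacitan', '', '', 'a', 'personal']

# split() with no argument splits on any run of whitespace and drops the empties.
print(raw.split())
# ['capacitan', 'a', 'personal']
```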

Getting Started With The Command Line

First of all, what is a ‘command line’? Visually, within a Microsoft Operating System, the command line looks like this:

Pictured is a command line. You can access this by searching ‘CMD’ on your windows search bar.

As the name implies, a command line is an interface where a user inputs literal commands to accomplish a computing task. The same or similar tasks are often possible via mouse clicks, keyboard commands outside of the CMD screen, etc. In the 1970’s, prior to the invention of the mouse and the point and click method for accessing files, the command line was the primary mode of interacting with a computer. You may reasonably ask: ‘If people can accomplish basic computing tasks with a simple mouse click and scroll of the screen, then why would anyone use the command line?’ The answer is that in modern computing, heavier or more complex tasks can be accomplished more easily by providing specific instructions that cannot easily be expressed with mouse clicks.

Let’s explore a simple and commonly used command to just get started in the command line.

The MV command

The MV command, an often used command for server administrators and grunt level programmers (like me), can help move files around and between directories. Think of programming at this level as steps or tasks accomplished. Individually, this command seems insignificant, but oftentimes, commands are used in harmony with other commands.

As I alluded to above in the introduction, there is some parity between the tasks accomplished with a mouse and the command line. They are both a type of interface with the computer/server.

For example, one could drag and drop a series of files, one by one, into a different folder based on some criteria. However, let’s say you have a large set of files with some common denominator in terms of text or naming conventions. While you could continue to drag and drop, there are moments in which there are simply too many files to easily view within a screen.

This simple command, called ‘mv’ (short for move), permits you to move files to other directories within a command line interface.

mv FILENAME somedirectory/

In general, you first type the command, in this case ‘mv’, then to the right of that command you specify which files you will move, followed by the destination directory.

mv FILENAME ZIP/

Below is a screenshot of a real world example, a screenshot of a directory that contains these files and the directory where I want to place these files. You may have noticed that there is a ‘*‘ symbol right before zip and then the directory ZIP.

The ‘*’ symbol is the wildcard character (it shares its symbol with the Kleene star of regular expressions). The wildcard matches any sequence of characters of any length, including none. This * character tells the computer to look for file names with any characters before the string I specified, ‘zip’. It will therefore move all the zip files into the ZIP/ directory.
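
To see what such a pattern matches without touching any files, here is a minimal Python sketch using the standard library's `fnmatch` module, which implements shell-style wildcards. The four zip names are the ones used in this walkthrough; `notes.txt` is a hypothetical extra file added to show a non-match.

```python
from fnmatch import fnmatchcase

names = [
    "g2p-seq2seq-master.zip",
    "spa-eng.zip",
    "spanish_g2p.zip",
    "NER_news-main.zip",
    "notes.txt",  # hypothetical, added to show a non-match
]

# fnmatchcase applies the shell-style pattern with case-sensitive comparison.
matches = [name for name in names if fnmatchcase(name, "*zip")]
print(matches)
# ['g2p-seq2seq-master.zip', 'spa-eng.zip', 'spanish_g2p.zip', 'NER_news-main.zip']
```

Note that the pattern has no dot, so a file named ‘unzip’ would match as well; ‘*.zip’ is the stricter pattern.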

‘MV’ command where we specify the movement of zip files into ZIP

Here is the literal codeblock:

mv *zip ZIP/

As you can see, the file names that can be moved are ‘g2p-seq2seq-master.zip’, ‘spa-eng.zip’, ‘spanish_g2p.zip’ and ‘NER_news-main.zip’. It’s a lot easier to just type *zip and go about your business that way. Imagine if instead of 4 files it was 400 files you needed to move. That scenario is where a simple command that can “catch-all” the filenames shows its utility.
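
For readers who work in Python rather than a shell, the same idea can be sketched with `pathlib` and `shutil` from the standard library. This is an illustration under my own assumptions, not a replacement for `mv`: the helper name `move_matching` is hypothetical, and the demonstration runs inside a temporary directory with empty placeholder files so that nothing real is moved.

```python
import shutil
import tempfile
from pathlib import Path


def move_matching(source_dir, pattern, target_dir):
    """Move every file in source_dir whose name matches pattern into target_dir."""
    target = Path(target_dir)
    target.mkdir(exist_ok=True)
    moved = []
    # sorted() materializes the matches before any file is moved.
    for path in sorted(Path(source_dir).glob(pattern)):
        if path.is_file():  # skip directories, including the target itself
            shutil.move(str(path), str(target / path.name))
            moved.append(path.name)
    return moved


# Demonstrate in a throwaway directory with empty placeholder files.
with tempfile.TemporaryDirectory() as tmp:
    for name in ["spa-eng.zip", "spanish_g2p.zip", "notes.txt"]:
        (Path(tmp) / name).touch()
    moved = move_matching(tmp, "*zip", Path(tmp) / "ZIP")
    remaining = sorted(p.name for p in Path(tmp).iterdir())

print(moved)      # ['spa-eng.zip', 'spanish_g2p.zip']
print(remaining)  # ['ZIP', 'notes.txt']
```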

Please note, you can also type the file names individually. Here is an example:

mv 'g2p-seq2seq-master.zip' 'spa-eng.zip' 'spanish_g2p.zip' 'NER_news-main.zip' ZIP/

Hopefully, this brief overview of how to use the ‘MV’ command is helpful. Feel free to reach out at lezama@lacartita.com with any questions. Thanks!

Leveraging NVIDIA Downloads

An issue during the installation of TensorFlow in the Anaconda Python environment is an error message citing the lack of a DLL file. Logically, you will also receive the same error when invoking any spaCy language models, which need TensorFlow installed properly.

Thus, running the code below will trigger an error message if the proper dependencies are not installed:

import spacy
import spacy.attrs
nlp = spacy.load('es_core_news_sm')

The error message below will appear if the NVIDIA GPU Developer kit is not installed:

"W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'cudart64_110.dll'; dlerror: cudart64_110.dll not found"
"I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine."

The issue is the lack of a GPU developer kit from NVIDIA.
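
Before reinstalling anything, it can help to check whether the library named in the message can be loaded at all. The sketch below uses Python's built-in `ctypes` module; `can_load_library` is a hypothetical helper name, and the DLL name is copied from the message above, so other CUDA versions would use a different file name.

```python
import ctypes


def can_load_library(name):
    """Return True if the dynamic library can be loaded, False otherwise."""
    try:
        ctypes.CDLL(name)
    except OSError:
        # Raised when the library is missing or cannot be loaded.
        return False
    return True


# DLL name taken from the message above; adjust it for other CUDA versions.
if can_load_library("cudart64_110.dll"):
    print("CUDA runtime library found.")
else:
    print("CUDA runtime library not found; the NVIDIA toolkit may be missing.")
```

The result depends entirely on the machine it runs on, so treat it as a quick diagnostic rather than proof that the GPU setup is complete.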

CUDA Toolkit 11.4 Update 1 Downloads | NVIDIA Developer