Web Scraping As A Sourcing Technique For NLP

Introduction: In this post, we provide a series of web scraping examples and references for people looking to bootstrap text for a language model. The advantage is that a greater number of spoken-language domains can be covered. Newer vocabulary, and possibly very common slang, is picked up through this method since most corporate language […]
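
As a rough illustration of the kind of scraping step the post describes, here is a minimal sketch that pulls paragraph text out of an HTML page using only Python's standard library. The class name `ParagraphExtractor`, the helper `extract_paragraphs`, and the sample markup are all hypothetical and are not taken from the post; a real scraper would also need to fetch pages and respect each site's terms.

```python
from html.parser import HTMLParser


class ParagraphExtractor(HTMLParser):
    """Collect the visible text of every <p> element (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self._depth = 0        # greater than 0 while inside a <p> element
        self._buffer = []      # text fragments of the paragraph being read
        self.paragraphs = []   # finished paragraphs, whitespace-normalized

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self._depth += 1

    def handle_endtag(self, tag):
        if tag == "p" and self._depth > 0:
            self._depth -= 1
            if self._depth == 0:
                # Join fragments and collapse runs of whitespace.
                text = " ".join("".join(self._buffer).split())
                if text:
                    self.paragraphs.append(text)
                self._buffer = []

    def handle_data(self, data):
        # Text outside <p> (headings, scripts) is ignored.
        if self._depth > 0:
            self._buffer.append(data)


def extract_paragraphs(html):
    """Return the list of paragraph strings found in an HTML document."""
    parser = ParagraphExtractor()
    parser.feed(html)
    parser.close()
    return parser.paragraphs


# Invented sample page, standing in for a downloaded document.
SAMPLE_HTML = """
<html><body>
  <h1>Headline</h1>
  <p>That show was <b>fire</b>, no cap.</p>
  <script>var x = 1;</script>
  <p>Second   paragraph
  here.</p>
</body></html>
"""
paragraphs = extract_paragraphs(SAMPLE_HTML)
print(paragraphs)
```

Keeping only paragraph text is one simple way to drop navigation and script noise before the text is used for a language model; other filters may suit other sites better.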

New B.1.1.529 Coronavirus Variant Poised To Be Deadlier Than Delta

Markets, medical experts and governments are raising concerns over the latest Coronavirus variant. As the public in the United States gathers in observance of Thanksgiving, South African experts and global governments are alarmed at the B.1.1.529 variant of the Coronavirus. Enough concern has been raised to partially shut down air traffic between the UK/Europe and parts of Africa. The […]

Another Zero Day Exploit For Microsoft

Even Windows 11 is affected. Apparently, one can open a command line window and deploy an exploit to raise permissions on a machine using a .exe file freely available on GitHub. Nice. The exploit works on Windows 10, Windows 11 and Windows Server versions of this OS. The exploit consists of a low-privileged user […]

Word2Vec Mexican Spanish Model: Lyrics, News Documents

A Corpus That Contains Colloquial Lyrics & News Documents For Mexican Spanish. This experimental dataset was developed by four social science specialists and one industry expert, myself, with different samples from Mexico-specific news texts and normalized song lyrics. The intent is to understand how small, phrase-level constituents will interact with larger, editorialized style […]

Mikey Garcia – A Manny Pacquiao Style Loss

The Sandor Martin upset is reminiscent of the great Manny “Pacman” Pacquiao’s upset loss to Jeff Horn. In Fresno, California, Mikey Garcia delivered a slow and methodical performance against Sandor Martin. For his part, Martin delivered on the expected southpaw style, consistent jab and constant lateral movement, which proved more valuable to the California judges […]

Semantic Similarity & Visualizing Word Vectors

Introduction: Two Views On Semantic Similarity. In Linguistics and Philosophy of Language, there are various methods and views on how to best describe and justify semantic similarity. This tutorial will be taken as a chance to lightly touch upon very basic ideas in Linguistics. We will introduce in a very broad sense the original concept […]

Shiba Coin Surge Explained: 4 Reasons Why General Market Trends Favor This Latest Surge

In the past 72 hours, the cutely named cryptocurrency, Shiba Inu Coin, an offshoot of the DOGE coin, overtook Bitcoin & Ethereum as the hottest coin in the market. It has risen at times to nearly 300 percent of its original value. Here are some possible reasons why: Underbanked see new road: Many people find it more […]

Survey: US-Based Mexican Average Salary Between 47k And 67k Annually

From May 2nd to May 5th of 2020, I gathered data through an online survey, administered with the help of Mexico-based data analysts who helped me recruit participants and review the data. My goal was to understand how COVID-19 affected my community’s economic status and employment prospects. The descriptions here apply to 77 confirmed […]

Demographics of Apple Executives

Most Decision Makers Are White: EEOD Data Indicates 99 Out Of 123 Execs Are White. Graphic: @Apple executives based on most recent EEOD data. Recently, the National Labor Relations Board sued Apple. According to the suit, Apple has fired employees in a retaliatory fashion. The employee in question had alleged sexual harassment by a […]

Tokenizing Text, Finding Word Frequencies Within Corpora

One way to think about tokenization is to consider it as finding the smallest possible unit of analysis for computational linguistics tasks. As such, we can think of tokenization as among the first steps (along with normalization) in the average NLP pipeline or computational linguistics analysis. This process helps break down text into a manner […]
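
To make the idea concrete, here is a minimal sketch of tokenizing a string and counting word frequencies with Python's standard library. The regular expression, the function names `tokenize` and `word_frequencies`, and the sample sentence are assumptions chosen for illustration, not the post's own code; real corpora usually call for a more careful tokenizer.

```python
import re
from collections import Counter

# Assumed token definition: a run of word characters, optionally followed
# by apostrophe-joined parts, so that "didn't" stays a single token.
TOKEN_PATTERN = re.compile(r"\w+(?:'\w+)*")


def tokenize(text):
    """Lowercase the text (a simple normalization) and split it into tokens."""
    return TOKEN_PATTERN.findall(text.lower())


def word_frequencies(text):
    """Map each token to the number of times it occurs in the text."""
    return Counter(tokenize(text))


# Invented sample sentence.
sample = "The cat sat. The cat didn't move; the dog did."
freqs = word_frequencies(sample)
print(freqs.most_common(3))
```

Lowercasing before counting merges "The" and "the" into one entry, which is usually what a frequency table wants, although it discards case information that other tasks may need.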

Getting Started With The Command Line

First of all, what is a ‘command line’? Visually, within a Microsoft Operating System, the command line looks like this: You can access the command line by typing ‘CMD’ on your Windows Desktop. As the name implies, a command line is an interface where a user inputs literal commands to accomplish a computing task. The […]

Frequency Counts For Named Entities Using spaCy/Python Over MX Spanish News Text

In this post, we review some straightforward code written in Python that allows a user to process text and retrieve named entities alongside their numerical counts. The main dependencies are spaCy, a small compact version of their Spanish language model built for Named Entity Recognition and the plotting library, Matplotlib, if you’re looking […]

Guide To Assembling And Installing The OS On A Raspberry Pi 4 Model B

In this guide we show, step by step, the assembly and operating system installation for a Raspberry Pi 4 Model B (8 GB RAM).

N-Gram Analysis Over Sensitive Topics Corpus

I was recently able to do some analysis over the Sugar Bear AI violence corpus, a collection of documents classified by analysts over at the SugarBear AI group. Over the past year, the group has been manually classifying thousands of Mexican Spanish news documents that deal with the new topics of today: “Coronavirus”, “WFH”, […]
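
For readers unfamiliar with n-gram analysis, the following is a minimal sketch of counting n-grams across a set of documents. The functions `ngrams` and `ngram_counts` are hypothetical, tokenization is assumed to be plain whitespace splitting, and the two sample documents are invented placeholders, not text from the corpus described above.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return every run of n consecutive tokens as a tuple."""
    if n < 1:
        raise ValueError("n must be at least 1")
    # A document shorter than n tokens yields an empty list.
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def ngram_counts(documents, n):
    """Count n-grams over all documents, never spanning document boundaries."""
    counts = Counter()
    for doc in documents:
        tokens = doc.lower().split()  # assumption: whitespace tokenization
        counts.update(ngrams(tokens, n))
    return counts


# Invented placeholder documents.
docs = ["trabajo desde casa", "el trabajo desde casa aumenta"]
bigram_counts = ngram_counts(docs, 2)
print(bigram_counts.most_common(2))
```

Counting per document, rather than over one concatenated string, avoids creating spurious n-grams from the last word of one document and the first word of the next.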

Sugar Ray Leonard: ‘Canelo Has My Vote As P4P’

On the Ak & Barak show, Sugar Ray Leonard labeled Canelo as the best pound-for-pound fighter in the world. Sugar Ray recalled his bouts against Roberto Duran, as well as ‘Marvelous’ Marvin Hagler’s boxing ability. Leonard empathized with Hagler’s inability to accept losing to him. He also praised the abilities of the main names […]

Elderly Asian Woman Robbed, Beaten In Oakland, California

Betty Yu on Instagram: “OAKLAND GRANDMOTHER ROBBED A 71-year-old woman who just left the bank is seen knocked to the ground and robbed near 6th/International on…” CBS Oakland reporter Betty Yu posted disturbing footage on her Instagram account of a group of males who robbed and beat an elderly Asian woman as she exited […]

Southern California: Chicanos Express Outrage Over Attacks On Asian Elderly

Throughout the day, Chicano communities have expressed outrage over violence against Asians stemming from false narratives around the pandemic. In Southern California, Chicano sentiment expresses solidarity with the Asian community. This could be observed through social media postings from influential people within the community, like El Indio Botanas & Cervezas. Due to the vast amounts of […]

Who’s Getting Covid-19 Fatigue?

There are a few ways to interpret the data we have amassed under the web application in Chicano Press. Today, we try to compare the aggregate infection counts to prior months. We hope to infer who is, in some sense, improving relative to where they were the month before. I’m sure there is a more sophisticated […]

Linguistics In The Enterprise

Why Linguistics (And Linguists) Are Always On The Back Foot In An Enterprise Context. Linguistics is often questioned by practitioners of the Natural Sciences in informal and professional settings. Perhaps this is because the phenomena relevant to the Natural Sciences are more readily observable through instrumental means. The irony is that […]

Wearables, Speech Recognition & Musk: How Intel’s Loss Could Be Tesla’s Gain

Despite its famously late arrival to mobile computing, Intel did make certain strides before many others in the wearables space from mid-2013 onwards. Much of it may have to do with the company’s strategic diversification, which took place around the same time. Hundreds of Millions Poured Into Research & Development: Intel invested at the very […]

Introduction To Python

Python is a fairly flexible and fast programming language, considering that it is a high-level language. This brief overview of the language presents an exercise that opens a file. This task is almost routine in any complex piece of work. Now, if you are looking for a more advanced exercise or one that simply covers a topic […]
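
As a minimal sketch of the file-opening exercise mentioned above, the snippet below writes a small throwaway file and reads it back. The file name `ejemplo.txt`, its contents, and the helper `read_lines` are assumptions made for illustration; the post's own exercise may differ.

```python
import os
import tempfile


def read_lines(path):
    """Open a UTF-8 text file and return its lines without trailing newlines."""
    # The with-statement closes the file automatically, even on errors.
    with open(path, encoding="utf-8") as handle:
        return [line.rstrip("\n") for line in handle]


# Create a temporary file so the example runs anywhere.
with tempfile.TemporaryDirectory() as tmp_dir:
    sample_path = os.path.join(tmp_dir, "ejemplo.txt")  # hypothetical file name
    with open(sample_path, "w", encoding="utf-8") as handle:
        handle.write("hola\nmundo\n")
    lines = read_lines(sample_path)

print(lines)
```

Passing the encoding explicitly avoids surprises on systems whose default encoding is not UTF-8, which matters for Spanish text with accented characters.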

Intro To Linguistic Theory

Linguistics is the science of language as it relates to human cognition. Metaphysical considerations on the properties of organic systems may seem far removed from the lower-level details of language data, but the general idea is that the language faculty is ‘perfect’: it has nearly exact properties that are recurrent and, while not wholly describable […]

After Success Unifying Super Middleweight Division, Canelo Calculates Legacy With Cruiserweight Challenge

Saul “Canelo” Alvarez is now contemplating a challenge against a much bigger man, the WBC champion in the Cruiserweight division. The Canelo legacy keeps rising as the 31-year-old Mexican enters his prime and relishes success compounded by multiple successful title defenses. Most recently, the Mexican has unified the […]

Canelo vs Plant Is Finally Here

Canelo is now set to face his last and potentially most difficult fight for Super Middleweight supremacy: Caleb Plant. The Super Middleweight unification bout is set to kick off tomorrow at around 6pm PT from Las Vegas, Nevada. It is priced at 75 dollars, which is not terrible given a decent undercard. Already, Canelo is a four-division […]