1. Context
To be eligible for inclusion in the S&P 500 index, a company must be a U.S. company, have a market capitalization of at least $11.8 billion, be highly liquid, and have a public float of at least 10% of its shares outstanding. Additionally, both its most recent quarter’s earnings and the sum of its earnings over the trailing four consecutive quarters must be positive. The S&P has no specific diversity-and-inclusion (D&I) requirements, and so far the SEC does not require companies to disclose this data at the board level. This leads to heterogeneous disclosure: not all companies disclose the race and ethnicity of their board members, and those that do follow no standard structure.
2. Task
The aim of this task is to gain insight into the racial and ethnic diversity of the board members of a subset of S&P 500 companies. To do this, the information should be extracted from the companies' D&I reports using Python regular expressions. A regular expression (RegEx) is a sequence of characters that forms a search pattern, often composed of several semantic groups. A RegEx can be used to check whether a string contains the specified search pattern (useful information can be found here). If the RegEx is properly built, its output will be a paragraph that is likely to contain the information we are looking for. The final number may still have to be extracted manually from this output paragraph. A Jupyter notebook with a complete coding example and a folder with the corresponding D&I reports in PDF format have been provided for this task.
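As a minimal sketch of the idea of using a RegEx to return the paragraph most likely to hold the information, the snippet below searches each paragraph of a piece of text for a pattern. The sample text, the pattern, and the helper name `matching_paragraphs` are illustrative assumptions and are not taken from the provided notebook; real reports would first need their text extracted from the PDF files.

```python
import re

# Hypothetical report text; in practice this would be text extracted from a PDF.
report_text = (
    "Our company was founded in 1950.\n\n"
    "Four of our 12 directors are racially or ethnically diverse.\n\n"
    "Revenue grew during the year."
)

# Assumed illustrative pattern: the word "director" or "directors", any case.
pattern = re.compile(r"\bdirectors?\b", re.IGNORECASE)


def matching_paragraphs(text, compiled_pattern):
    """Return the paragraphs (blocks separated by blank lines) containing a match."""
    paragraphs = re.split(r"\n\s*\n", text)
    return [p.strip() for p in paragraphs if compiled_pattern.search(p)]


hits = matching_paragraphs(report_text, pattern)
for paragraph in hits:
    print(paragraph)
```

The figure itself ("Four of our 12") still has to be read from the returned paragraph, which matches the manual extraction step described above.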
As you will see, company reports follow no standard rule for disclosing the race and ethnicity of board members, and the relevant sentences can be structured in different ways. A common structure for disclosing this information includes the following semantic groups:
A: An expression (integer or string) indicating number or percentage of directors such as “2 out of 10” or “20%”.
B: A word/phrase indicating the subject such as “director” or “board member”.
C: A word/phrase referring to racial and ethnic diversity such as “diverse”, “ethnic minorities”, “Black”, “Latin”, etc.
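The three groups above could be turned into patterns along the following lines. This is only a sketch under stated assumptions: the exact patterns, the names `GROUP_A`, `GROUP_B`, `GROUP_C`, and `candidate_sentences`, and the sample text are hypothetical, and the vocabulary is limited to the examples listed above. A sentence is kept when all three groups appear in it, in any order.

```python
import re

# Group A: a percentage ("20%") or a count ("2 out of 10", "2 of 10").
# Assumption: spelled-out numbers such as "two of ten" are not covered.
GROUP_A = r"(?:\b\d+(?:\.\d+)?\s*%|\b\d+\s+(?:out\s+)?of\s+\d+\b)"
# Group B: the subject of the sentence.
GROUP_B = r"\b(?:directors?|board\s+members?)\b"
# Group C: terms for racial and ethnic diversity (illustrative, not exhaustive).
GROUP_C = r"\b(?:diverse|ethnic(?:ally)?|minorit(?:y|ies)|Black|Latin\w*)\b"

PATTERNS = [re.compile(g, re.IGNORECASE) for g in (GROUP_A, GROUP_B, GROUP_C)]


def candidate_sentences(text):
    """Return sentences in which groups A, B and C all occur, in any order."""
    # Naive sentence split: break after ., ! or ? followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text)
    return [s for s in sentences if all(p.search(s) for p in PATTERNS)]


# Hypothetical sample text for illustration only.
sample = (
    "The board met six times in 2022. "
    "3 out of 11 directors identify as ethnically diverse. "
    "45% of employees are women. "
    "20% of our board members are Black or Latino."
)

results = candidate_sentences(sample)
for sentence in results:
    print(sentence)
```

Checking the groups independently, rather than in one fixed-order pattern, tolerates the varied sentence structures found across reports, at the cost of more false positives that need manual review. Case-insensitive matching also means a word like "black" in an unrelated sense would match.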