Unveiling Elite Networks Using Large Language Models (LLMs)
Abstract
This paper is the methodology portion of a larger project that investigates how elite networks in autocracies shape the constraints on the leader, as well as the implications for autocracies’ foreign and security policies. Social relationships and interactions are central to the dynamics of elite politics, which in turn have important social, economic, and political implications. Nevertheless, the study of elite networks and factional dynamics has been limited to individual countries and/or snapshots in time. This gap is understandable: collecting information on elites’ friends and rivals requires an immense amount of work, especially across time and space. To address this problem, this project leverages recent developments in Large Language Models (LLMs) to conduct information extraction (IE) tasks, which allows for the construction of networks of elites across government cabinets from 1990 to 2015. Specifically, using both manually collected data and teaching data from larger models such as Llama 3.3-70B and GPT-4, I fine-tune Llama 3.1-8B to detect and extract relevant details from each elite’s biographical records (e.g., schools, job titles, organizations, job timeframes, family members, and colleagues). The data collected by the fine-tuned model are then refined through human correction before being used to build elite networks based on overlaps among elites’ working histories and family ties. The project adds to the growing body of recent elite data collection efforts, offering a powerful tool for capturing information about both the backgrounds and relationships of elites. More broadly, this LLM-based framework enables large-scale data collection applicable to a wide range of research topics beyond elite politics, with relatively high accuracy and notably lower resource requirements.
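To make the network-construction step concrete, the following is a minimal sketch of how ties might be derived from overlapping working histories. It is an illustration under stated assumptions, not the project's actual pipeline: the `Spell` record, the year-level granularity, exact matching on organization names, and the use of shared years as edge weights are all hypothetical choices introduced here.

```python
from itertools import combinations
from typing import NamedTuple


class Spell(NamedTuple):
    """One extracted career spell (hypothetical schema, assumed for illustration)."""
    elite: str
    organization: str
    start: int  # first year in the post (inclusive)
    end: int    # last year in the post (inclusive)


def overlap_years(a: Spell, b: Spell) -> int:
    """Number of calendar years two spells share in the same organization."""
    if a.organization != b.organization:
        return 0
    return max(0, min(a.end, b.end) - max(a.start, b.start) + 1)


def build_edges(spells: list[Spell]) -> dict[tuple[str, str], int]:
    """Map each unordered pair of elites to their total years of shared service."""
    edges: dict[tuple[str, str], int] = {}
    for a, b in combinations(spells, 2):
        if a.elite == b.elite:
            continue  # skip pairs of spells belonging to the same person
        years = overlap_years(a, b)
        if years > 0:
            pair = tuple(sorted((a.elite, b.elite)))
            edges[pair] = edges.get(pair, 0) + years
    return edges


# Toy records with placeholder names; not real data.
spells = [
    Spell("A", "Ministry of Finance", 1992, 1998),
    Spell("B", "Ministry of Finance", 1996, 2003),
    Spell("C", "Central Bank", 1995, 2001),
    Spell("A", "Central Bank", 2000, 2005),
]
edges = build_edges(spells)
```

In this toy input, A and B overlap at the same ministry and A and C overlap at the same bank, so each pair receives a weighted tie, while B and C share no organization and remain unconnected. Family ties could be added as a second edge type in the same structure.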
