Xinran (Katherine) Zhang

Aspiring Data Scientist

University of Sydney

Hi there! This is Xinran Zhang. I am currently a third-year undergraduate student at the University of Sydney, majoring in Statistics and Financial Mathematics. I aspire to be a Data Scientist who explores the world through a quantitative lens and finds patterns in the mess of everyday life. I look forward to applying what I’ve learned to make a positive impact on the world.

I have experience working and researching in Data Science, Optimization, Time Series, and Computational Systems Biology. I enjoy solving real-life problems with computational methods and finding insights in datasets. My research interests include machine learning, neural networks, and their applications to large-scale genomic data.

Hobbies: Photography, Musicals, Piano, Travel (see my photographs in the Gallery section)

Skills

R
Python
Photography

Education

BSc in Statistics and Financial Mathematics
University of Sydney
August 2020 – Present Sydney

Weighted Average Mark (WAM): 90.5/100

Activities and societies: Sydney University Data Society; Sydney University Musical Theatre Ensemble; Sydney University Movement and Dance Society

Awards: Charles Perkins Centre Summer Research Scholarship; University of Sydney Academic Merit Prize; Beta Gamma Sigma lifetime member; Student Exchange Travel Scholarship; Vice Chancellor’s Global Mobility Scholarship

Exchange Student, concentration in Statistics and Data Science
University of Pennsylvania
January 2022 – June 2022 Philadelphia

GPA: 3.94/4.00

Activities and societies: Penn Data Science Group; Wharton Women; Penn Quakers Venture Club

Experience

Student Researcher
September 2022 – Present Sydney
  • Supervised by Professor Pengyi Yang and Dr Hani Jieun Kim
  • Project: A Data Science Approach to Investigate Human Development

Financial Mathematics Capstone Project
University of Sydney
July 2022 – November 2022 Sydney
  • Supervised by Dr Zhou Zhou and Dr Lindon Roberts
  • Project 1: Monte Carlo Method in Option Pricing
  • Project 2: Portfolio Optimization with Market Data

Investment Research Intern
August 2021 – January 2022 Shanghai
  • Industry research: researched industry trends, market share, and the competitive landscape for autonomous cars, GPUs, DPUs, and semiconductors, informing the technology group’s investment decisions
  • Financing materials: independently produced 3 pitchbooks on millimeter-wave radar and GPUs, built the storylines and designed bilingual slides, and supported financing worth billions of CNY
  • Multitasking: assisted with roadshows, due diligence, and senior executive interviews, and wrote meeting minutes, improving team productivity

Data Analysis Intern
June 2021 – August 2021 Hangzhou
  • SQL: wrote 1,500+ lines of SQL to select data and retrieve key information for 12 core indicators of sales performance and service quality, processing 10,000+ rows of raw data every day
  • Python: imported 1,300+ order records from a third-party API, aligned data definitions, extracted data from the Hive data warehouse, and automated data entry from a third-party sales platform into the in-house system
  • Report and presentation: presented 2 user persona and A/B test reports to the product department, comparing the efficiency of dispatch mode and default mode and analyzing a significant decline in conversion rate
  • Collaboration: helped complete the data visualization objective 15 days ahead of schedule and documented working experience and instructions for incoming interns

Digital Transformation Project Assistant
Accenture
April 2021 – June 2021 Shanghai
  • Excel: collected and organized data on the infrastructure of public cloud platforms from company research on AWS, Microsoft Azure, Alibaba Cloud, and Huawei Cloud, using advanced Excel functions
  • Tableau: visualized data in Tableau, producing pie charts and heatmaps of companies’ market share and bar charts of cloud resource capacity, and comparing companies’ production workloads, service availability, and risk control using pie charts
  • Meticulousness: designed slides, wrote meeting minutes, and ensured all client-side documents were error-free

Volunteer
The Red Cross
September 2018 – Present Multi-site
  • Served as a volunteer teacher for 4 months in Shanxi Province and kept in touch with the children afterwards
  • Gave speeches in 9 schools and 15 neighborhoods to collect donations for poverty-stricken and disaster-affected areas
  • Volunteered as an assistant for 30+ hours on mobile blood donation and collection vehicles

Projects

Portfolio Optimization with Market Data
Used monthly return data for 8 assets to optimize static and dynamic portfolios, and explored different scenarios with alternative risk measures and market dynamics.
Monte Carlo Method in Option Pricing
For European and Asian call option pricing, explored variance reduction methods (antithetic variables, importance sampling, control variates) and path generation schemes (Milstein and Euler), and compared the performance of the Monte Carlo method with the Crank-Nicolson finite difference scheme.
Bankruptcy of Pacific Gas and Electric
Fitted an ARMA-GARCH model to simple monthly returns of Pacific Gas and Electric common stock for 1998 through 2021, using a GARCH(1,1) model to capture the volatility.
Early Social Determinants of Adolescent Wellbeing
Developed a model of adolescent well-being from early social and structural health determinants, using the Fragile Families sample, which longitudinally follows 5,000 children from birth to age 15, with modeling strategies such as LASSO regression and random forest with PCA and bagging.
Predict Ratings from Reviews on Yelp
Used 100,000 Yelp reviews and ratings to build a classifier, conducted sentiment analysis, and constructed word clouds showing typical words in good and bad reviews respectively. Implemented a self-designed neural network that predicts ratings from reviews with 81%+ accuracy; it has 2 hidden layers with 16 and 8 neurons, ReLU activations, and a sigmoid output layer.
Imports by U.S. from China
Fitted a seasonal ARIMA model with calendar trigonometric pairs to analyze trends and changes in U.S. imports from China, presented the dynamic seasonal structure, and performed a residual analysis using the residual ACF, PACF, and spectral density.
Readmission Probability of Diabetes Inpatients
Predicted the likelihood of diabetes patients being readmitted within 30 days of hospital discharge, applying a LASSO penalty to induce sparsity and obtain a parsimonious logistic regression model; variables included patient demographics, medical history, and medication details.
Chat Forum Moderator Program
This project creates a program called moderator.py that moderates a chat forum, along with a program to test moderator.py. The moderator program is divided into parts covering command-line arguments, reading files, writing to files, replacing text, and writing classes.
COVID-19 Mortality Rate
Visualized the evolution of COVID cases and COVID-related deaths at the state level through spaghetti plots and animated heatmaps, and used LASSO and BIC to build and select the best multiple regression model, which identified critical county-level demographic factors and policy interventions associated with mortality rate in the US.
Gym Members Instruction
This project creates a program for a local gym to assess its members’ needs and assign exercises based on their situations and goals. The program decides the best exercise for each member and gives detailed instructions for performing it.

Contact