Masters Candidates

MASAI JONAH MUDOGO

Jonah Mudogo Masai is an ambitious, highly motivated, hardworking, confident and passionate actuarial science graduate with a remarkable degree of adaptability, and a team player. Having worked at Directline Assurance Company and currently working in the treasury department of one of the leading banks in Kenya, he believes that with resilience and constant belief in God everything is achievable, and that fortune favors the brave. In addition, Jonah is well conversant with the R programming language and looks forward to furthering his studies in the same field, since current life dynamics require ideal, astute and timely risk-mitigating measures, which are core to Actuarial Science.

Project Summary

ABSTRACT

MODELLING TIME TO DEFAULT ON KENYAN BANK LOANS USING NON PARAMETRIC MODELS

In past decades, financial institutions have faced many risks that must be dealt with sensitively and in accordance with the instructions of the Central Bank of Kenya (CBK). At the forefront of these risks is credit risk which, if ignored, would likely plunge banks into a myriad of problems or even into bankruptcy. Papers on statistical models detailing how to model credit risk have been published and have enabled banks to differentiate ’good’ and ’bad’ clients based on repayment performance during the loan term. Credit granting is one of the main ingredients required for economic growth in any country. However, the technicalities attached to it pose a dilemma to lending institutions on the appropriate approach to adopt when lending in order to minimize losses resulting from default. The objective of this research is to identify credit scoring factors and to select the non-parametric survival analysis model that is most effective for modelling time to default. The variables considered, based on FICO, include the income of the company, the age of the company and the account. It was evident that the oldest companies, whose accounts were opened more than 8 years before the loan application, have a lower tendency to default. The study also shows that the Nelson-Aalen estimator is a better estimator of time to default than the Kaplan-Meier estimator. The study recommends further studies that incorporate macroeconomic variables to establish their impact on clients’ loan repayment performance and that estimate the time to second default. It would also be interesting to extend this study to the mixture cure model and to compare the performance of the resulting model with that of the Cox proportional hazards model with penalized splines, since our study involved a univariate method.
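The two estimators compared in the abstract can be sketched in a few lines. The following is a minimal illustration only, not the code used in the study: the function name `km_and_na` and the toy loan data are assumptions made for this example. It computes the Kaplan-Meier survival estimate and the survival estimate implied by the Nelson-Aalen cumulative hazard, exp(-H(t)), at each observed default time.

```python
from math import exp


def km_and_na(times, events):
    """Return [(t, S_KM(t), S_NA(t))] at each distinct default time.

    times  : follow-up time of each loan (any consistent unit)
    events : 1 if the loan defaulted at that time, 0 if it was censored
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    km_surv = 1.0      # Kaplan-Meier product-limit estimate
    cum_hazard = 0.0   # Nelson-Aalen cumulative hazard
    out = []
    i = 0
    while i < len(data):
        t = data[i][0]
        defaults = 0   # defaults observed exactly at time t
        leaving = 0    # all loans leaving the risk set at time t
        while i < len(data) and data[i][0] == t:
            defaults += data[i][1]
            leaving += 1
            i += 1
        if defaults > 0:
            km_surv *= 1.0 - defaults / n_at_risk
            cum_hazard += defaults / n_at_risk
            out.append((t, km_surv, exp(-cum_hazard)))
        n_at_risk -= leaving
    return out


# Toy data, made up for illustration (not from the study).
times = [2, 3, 3, 5, 7, 8, 8, 10]
events = [1, 1, 0, 1, 0, 1, 1, 0]
for t, s_km, s_na in km_and_na(times, events):
    print(t, round(s_km, 3), round(s_na, 3))
```

Because exp(-x) is never smaller than 1 - x, the Nelson-Aalen based survival estimate is never below the Kaplan-Meier estimate, and the two differ most when few loans remain at risk.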

Links

Macharia Shelmith Wanjiru

Student Bio

Shelmith Wanjiru Macharia: MSc. Biometry (UoN, 2020) and BSc. Statistics (first class honors, UoN, 2018). Shelmith is a recipient of the DELTAS Africa - Sub-Saharan Africa Consortium for Advanced Biostatistics (SSACAB) MSc scholarship. For her thesis, she conducted a simulation study to compare the performance of Multiple Imputation Chained Equations (MICE) and K Nearest Neighbors (KNN) based imputation on missing Demographic Health Survey data under the guidance of Dr. Timothy Kamanu (UoN) and Mr. Paul Mwaniki (KEMRI Wellcome Trust Programme). Shelmith works as a data scientist at AJUA (formerly mSurvey). She is passionate about data science, machine learning and statistics. She is an R and Python enthusiast and a speaker at R-Ladies Nairobi, where she recently gave a talk on text mining in R. Shelmith volunteers as a judge on the data science technical review bench at Moringa School and as a facilitator at various data science meetups in Nairobi. Her hobbies are travelling and hiking.

Project Summary

Project Title:

Comparing the performance of Multiple Imputation Chained Equations (MICE) and K Nearest Neighbors (KNN) based imputation on missing Demographic Health Survey (DHS) data: A simulation study

Abstract

Background: Missing data is a common problem in Kenya Demographic Health Survey (KDHS) datasets. Complete case analysis is often used to deal with missingness, which may lead to bias. Multiple imputation chained equations (MICE) allows for uncertainty in imputations and often uses linear models to impute missing values. Linear models may give biased estimates when a strict linear relationship does not exist between the response variable and covariates. Machine learning based imputation methods are an alternative to MICE since they do not require a linear relationship between the response variable and covariates, but they may underestimate the standard errors of estimates. This study seeks to compare parameters obtained from MICE and KNN based imputation and to test whether KNN based imputation underestimates the standard errors of estimates.

Methods: Missing values in the KDHS 2014 children's dataset were substituted with medians and modes to obtain a complete dataset. Missingness was then introduced into the complete dataset in different proportions. A hundred simulated datasets were obtained for each proportion of missingness. MICE and KNN based imputation were applied to each simulated dataset. Logistic regression models were fitted to the complete and imputed datasets. Differences between regression parameters from the complete and imputed datasets were used to evaluate the performance of MICE and KNN based imputation.

Results: KNN based imputation performed better than MICE. KNN based imputation underestimated standard errors in a number of cases. The proportion of missingness did not have an effect on the results obtained from imputed datasets.

Conclusion: This study recommends KNN based imputation as a method that deserves further consideration in future.
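To make the idea of KNN based imputation concrete, here is a minimal sketch for numeric data. It is an assumption-laden illustration, not the procedure used in the thesis: the function name `knn_impute`, the use of squared Euclidean distance on the observed columns, and the toy rows are all hypothetical. Each missing entry is replaced by the mean of that column over the k nearest complete rows.

```python
def knn_impute(rows, k=2):
    """Fill None entries with the column mean over the k nearest complete rows.

    Distance is squared Euclidean, computed only on the columns that are
    observed in the incomplete row. Rows are lists of numbers or None.
    """
    complete = [r for r in rows if None not in r]
    if not complete:
        raise ValueError("at least one complete row is needed")
    filled = []
    for r in rows:
        if None not in r:
            filled.append(list(r))
            continue
        observed = [j for j, v in enumerate(r) if v is not None]
        # Rank complete rows by distance to r on the observed columns.
        ranked = sorted(
            complete,
            key=lambda c: sum((c[j] - r[j]) ** 2 for j in observed),
        )
        nearest = ranked[:k]
        filled.append([
            v if v is not None else sum(c[j] for c in nearest) / len(nearest)
            for j, v in enumerate(r)
        ])
    return filled


# Toy data, made up for illustration: the last row has a missing value.
rows = [
    [1.0, 10.0],
    [2.0, 20.0],
    [3.0, 30.0],
    [9.0, 90.0],
    [2.5, None],
]
print(knn_impute(rows, k=2)[-1])
```

A single fill-in like this treats the imputed value as if it had been observed, which is one way to see why such methods can understate standard errors, whereas MICE produces several imputed datasets and pools the results.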

 

 

Links

Masese Omaanya Victor

Student Bio

Victor Masese is a master’s student at the University of Nairobi majoring in Actuarial Science. His research focuses on the application of Generalized Linear Models in pricing usage-based insurance, an idea that took shape while he was working at Heritage Insurance as an actuarial intern. He holds a bachelor’s degree in Actuarial Science from Jomo Kenyatta University of Agriculture and Technology. He hopes to extend his knowledge and skills into the insurance industry.

Project Summary

Project Title:

Application of Generalized Linear Models in Pricing Usage-Based Insurance

Abstract

Technological advancements and big data adaptations are broadly impacting the insurance industry. Usage-Based Insurance (UBI) is a result of these emerging technologies and of big data adaptation, as it is based on telematics data that is captured and relayed in real time by telematics devices installed in insured cars. Consequently, the data about policyholders transmitted to insurers by telematics devices is increasing at an exponential rate, which necessitates big data adaptation. Actuaries use Generalized Linear Models (GLMs) as the insurance industry’s standard method for determining auto premium rates. This study applies GLMs to estimate the impact of how and when driving is done on the premium rates charged. The factors considered in the analysis include average speed, distance driven and time of day. In the insurance portfolio analyzed, the pure premium increases with speed and with distance covered, while driving more during the day than at night decreases the pure premium. These findings are representative of the auto insurance policies analyzed but do not represent a generalized trend. The results of this study can help auto insurers to evaluate driving risk more precisely and to come up with personalized premiums for drivers based on their driving behavior.
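As a rough illustration of how a GLM links a telematics factor to a rate, the sketch below fits a Poisson GLM with a log link and one covariate by Newton-Raphson. Everything specific here is an assumption: the abstract does not state the response distribution or link that was used, the function name `fit_poisson_glm` is hypothetical, and the data are made up so that the fit can be checked by hand.

```python
from math import exp, log


def fit_poisson_glm(x, y, iters=50, tol=1e-12):
    """Fit log(E[y]) = b0 + b1 * x by Newton-Raphson (Poisson, log link)."""
    b0, b1 = log(sum(y) / len(y)), 0.0   # start from the intercept-only fit
    for _ in range(iters):
        mu = [exp(b0 + b1 * xi) for xi in x]
        # Score vector: X^T (y - mu)
        g0 = sum(yi - mi for yi, mi in zip(y, mu))
        g1 = sum(xi * (yi - mi) for xi, yi, mi in zip(x, y, mu))
        # Information matrix: X^T diag(mu) X (2 x 2, symmetric)
        h00 = sum(mu)
        h01 = sum(xi * mi for xi, mi in zip(x, mu))
        h11 = sum(xi * xi * mi for xi, mi in zip(x, mu))
        det = h00 * h11 - h01 * h01
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0, b1 = b0 + d0, b1 + d1
        if abs(d0) + abs(d1) < tol:
            break
    return b0, b1


# Made-up data: x is distance driven (hypothetical units), y is claim counts
# that double with each unit of distance, so the fitted relativity is 2.
distance = [0.0, 1.0, 2.0, 3.0]
claims = [1.0, 2.0, 4.0, 8.0]
b0, b1 = fit_poisson_glm(distance, claims)
print(round(b0, 6), round(exp(b1), 6))
```

With a log link, exp(b1) acts as a multiplicative relativity: each extra unit of the factor multiplies the expected response by that amount, which is how a positive coefficient on distance or speed translates into a higher premium.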

 

 

Links

OKELLO ERICK OTIENO

Student Bio

Erick is a statistician with a keen interest in survival analysis, linear mixed effects models and joint modelling of longitudinal and time-to-event data. It was while pursuing his undergraduate degree (BSc Applied Statistics) at Maseno University that he developed an interest in statistics. This, coupled with the market need for individuals skilled in statistical techniques for analysing data, motivated him to pursue his M.Sc. in Biometry at the University of Nairobi. Through this learning experience, he gained valuable skills in advanced data analysis methods. Currently, Erick is a Statistical Programmer at Phastar, a global CRO that provides statistical consulting and clinical trial data collection and reporting services to pharmaceutical and biotech companies. Previously, Erick was a Data Management Officer with NRHS, providing end-to-end technical skills in HIV projects funded by multiple donors such as CDC, UIC and PEPFAR. Erick is a firm believer in utilizing quality, timely and accurate data for decision making and policy development.

Project Summary

Project Title: Joint Modelling of CD4 Count and Time To Wound Healing in HIV-Positive Men Following Circumcision

Abstract

Background: Modelling longitudinal information and event time outcomes simultaneously helps in describing the progression of a disease over time. Past studies have mostly applied the standard Cox proportional hazards model to establish the association between baseline CD4 count and time to wound healing following circumcision. However, the Cox proportional hazards model does not take into account the special features of biomarkers, nor does it utilize the entire longitudinal history of measurements. Consequently, results reported from the Cox proportional hazards model could be biased or inefficient. To optimally investigate the association between CD4 count and time to wound healing, we used a joint modelling framework. In this framework, we utilized patients’ entire longitudinal history of CD4 count, while also properly accounting for measurement error caused by biological variation and for missing measurements.

Methods: In the first step, we fitted a linear mixed effects model to describe the evolution of square root CD4 count over time for each patient while adjusting for the a priori selected baseline covariates. In the second step, we used the estimated evolution (square root CD4 count) in the Cox proportional hazards model to determine its relationship with time to wound healing. Some CD4 count values were missing for some patients at follow-up visits. This is a missing data problem typical of longitudinal studies. We assumed that the missingness mechanism was missing at random (MAR), and thus the results reported from the joint models are still valid under MAR.

Results: 115 out of 119 patients completed their follow-up visits and their wounds were certified fully healed. Median time to wound healing was 49 days (IQR: 49-63). There was no association between the current true value of square root CD4 count and wound healing time (p-value = 0.536). However, for patients with the same current true value of square root CD4 count at a given time point t, the log hazard ratio for a unit increase in the rate of change in the square root CD4 count trajectory was 1.514 (95% CI: 1.121; 1.908).

Conclusion: Circumcising HIV-positive patients with any level of square root CD4 count is not harmful to their post-circumcision wound healing. However, patients with the same current true level of square root CD4 count could exhibit different slopes of the square root CD4 count trajectory at the same time point t, leading to different progression of wound healing between them.

Links

Ogola Jacob Abuor

Student Bio

Jacob Abuor Ogola: BSc (Mathematics), Applied Mathematics major; MSc in Applied Mathematics, with a study of integral equations and their numerical solutions. Areas of interest: ODEs, PDEs, Functional Analysis, Geometry, Numerical Analysis and Fluid Dynamics.

Project Summary

Project Title: Numerical Solutions of Fredholm Integral Equations of the Second Kind.

Project Abstract

This thesis focuses on the mathematical and numerical aspects of Fredholm integral equations of the Second Kind. Because of their wide range of physical applications, we deal with three types of equations, namely differential equations, integral equations and integro-differential equations. Some of the applications of integral equations are heat conducting radiation, elasticity, potential theory and electrostatics. In general, an integral equation is an equation in which an unknown function occurs under an integral sign. Integral equations can be classified according to three different dichotomies: (1) the nature of the limits of integration, (2) the placement of the unknown function and (3) the nature of the known function. After classifying these integral equations, we investigate some analytical and numerical methods for solving Fredholm integral equations of the Second Kind. The analytical methods include degenerate kernel methods, Adomian decomposition methods and successive approximation methods; the numerical methods include degenerate kernel methods, projection methods, Nyström methods and spectral methods. The main objective of the thesis is to study Fredholm integral equations of the Second Kind. In Chapter 4 we give approximate (spectral) methods for solving these equations using the classical orthogonal polynomials. This is the main idea of the thesis: applying spectral approximation methods to approximate the solutions of Fredholm integral equations of the Second Kind.
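A Fredholm integral equation of the Second Kind has the form u(x) = f(x) + λ ∫ K(x, t) u(t) dt over a fixed interval [a, b]. As a minimal sketch of one of the numerical methods listed above, the code below implements a Nyström method with composite Simpson weights. It is not the spectral method that the thesis develops; the function names `solve_linear` and `nystrom` and the test problem (kernel K(x, t) = x t, λ = 1, exact solution u(x) = x) are assumptions chosen so the result can be verified.

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def nystrom(kernel, f, lam, a, b, n):
    """Approximate u in u(x) = f(x) + lam * int_a^b K(x, t) u(t) dt.

    The integral is replaced by composite Simpson quadrature on n
    subintervals (n must be even), giving the linear system
    (I - lam * K * W) u = f at the quadrature nodes.
    """
    if n % 2:
        raise ValueError("n must be even for Simpson's rule")
    h = (b - a) / n
    nodes = [a + i * h for i in range(n + 1)]
    weights = [
        h / 3 * (1 if i in (0, n) else 4 if i % 2 else 2)
        for i in range(n + 1)
    ]
    A = [
        [(1.0 if i == j else 0.0) - lam * weights[j] * kernel(nodes[i], nodes[j])
         for j in range(n + 1)]
        for i in range(n + 1)
    ]
    rhs = [f(x) for x in nodes]
    return nodes, solve_linear(A, rhs)


# Test problem with known solution u(x) = x on [0, 1]:
# f(x) = x - x * int_0^1 t^2 dt = 2x/3.
nodes, u = nystrom(lambda x, t: x * t, lambda x: 2 * x / 3, 1.0, 0.0, 1.0, 8)
print(max(abs(ui - xi) for xi, ui in zip(nodes, u)))
```

For this test problem the integrand is a quadratic in t, which Simpson's rule integrates exactly, so the computed values agree with u(x) = x up to rounding error. For general kernels the error depends on the quadrature rule and on n.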

Links

Muremba Solomon Ng'ang'a

Student Bio

Solomon Ng'ang'a was born in the small town of Limuru. He attended Furaha Primary School from 1995 to 2003 and later Queen of Apostles Seminary from 2003 to 2007. He then proceeded to the University of Nairobi, where he studied for a Bachelor of Science (Mathematics) from 2009 to 2013. In 2018 Solomon enrolled for a Master of Science in Social Statistics at the University of Nairobi and is due to graduate on 11th Dec 2020. Solomon hopes to use the skills gained to start a consultancy firm that helps analyze business trends and offers solutions to different customers.

Project Summary

Project Title: Child mortality differentials in conflict-prone counties in Kenya

Project Abstract:

This thesis had two objectives. The first was to estimate child mortality rates in conflict-prone counties. The second was to estimate the child mortality differentials in these counties based on socio-economic factors. The study determined that as the education of the mother increases, the mortality of her children decreases. Single and married mothers experienced low child mortality rates, while divorced or separated and widowed mothers experienced high child mortality rates. Mothers from urban areas had low child mortality compared to their rural counterparts.

Links

Olonde Peter Okoth

Student Bio

I am a relations officer at the Co-operative Bank of Kenya. Apart from this, I am a committed scholar who is always keen to do careful research in dynamics and mechanics, the fields of applied mathematics that interest me most. In my spare time, I tutor some Canadian mathematics students online, and it is deeply rewarding to put into practice the mathematical knowledge I acquired at the University of Nairobi. I intend to continue delving deep into research.

Project Summary

Project Title: Control of Mechanical System by Moving Coordinates and Motion in Fluids, By Applying of Additional Forces and Having Coordinates as A Function of Time.

Project Abstract

The thesis studies the control of a mechanical system by moving coordinates, and locomotion in a fluid. It studies two essentially different ways of controlling a mechanical system’s motion: by applying additional forces, and by directly prescribing some of the coordinates as functions of time. A Flettner rotor initiates locomotion of a mechanical system in a fluid, and by changing the position of the center of mass or of an internal mass, the body can then be moved and controlled. Full stabilization is realized at any point of space when the mechanical system is subjected to circulation. When the mechanical system is subjected to non-holonomic constraints, the asymptotic stability of a non-equilibrium position is weakened and becomes non-asymptotic. Under holonomic constraints with a weak non-holonomic component, a system can be stabilized to a stable, non-asymptotic state. The thesis also expounds the measure of differential involvement in the control of motion for finite-dimensional Lagrangian systems and explains the set-valued force laws that arise from the system's interaction with its environment. Set-valued force laws depend on geometric and kinematic quantities. Because of this dependence, the relationship between forces and kinematic quantities is surveyed in detail. Classically, non-potential unilateral forces are captured by appropriate generalized force directions. The dissertation also examines the controllability of bodies with infinite-dimensional extensions, immersed in viscous fluids with non-zero vorticity. In particular, we obtain controllability and stabilization properties for these infinite-dimensional systems.

Links

Saidi James Kamau

Student Bio:

Currently aiming at a PhD in Biostatistics, with a vast interest in the agricultural and biomedical research fields. Enthusiastic and passionate about teaching areas related to Mathematical Statistics. Previously completed an MSc in Biometry and a BSc in Statistics. Once a college tutor and a volunteer secondary school teacher.

Project Summary

Project Title: Modelling Tuberculosis treatment outcomes using a Discrete Time Markov Chain model.

Project Abstract:

Tuberculosis (TB) is a disease that mostly affects the lungs and can be fatal when it is not followed up and appropriate measures are not taken to manage its severity and advancement in a population. Despite TB being preventable and curable, approximately 10 million people worldwide get it every year. This study investigated TB management outcome dynamics, estimated the transition probabilities of TB treatment outcomes and predicted future treatment outcomes using a Discrete Time Markov Chain model. The results showed a gradual increase in transition probabilities from the non-absorbing states to the cured and dead states, although the proportion of persons transiting to cure was higher than the proportion transiting to death. Further, transitions from non-absorbing states back to non-absorbing states steadily declined from 80.62% in the 1st year to 0 for most of the transitions in the 10th year. In the 13th year, the patients were either in the cured or the dead state. Those lost to follow-up (6.11%) were more than those transferred out (2.47%), and more patients with Extra-Pulmonary TB (10.94%) were dying, despite none having a treatment failure and all completing treatment, in comparison to those with Pulmonary TB (7.04%). Future research could investigate why the proportion of Extra-Pulmonary TB patients who die is higher than that of Pulmonary TB patients, and why more patients are lost to follow-up. Increasing the patients’ follow-up period beyond one year would also shed more light on the transition probabilities of TB treatment outcomes.
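The behaviour described above, where probability mass drains from non-absorbing states into the cured and dead states, can be sketched with a small discrete time Markov chain. The three states and all transition probabilities below are hypothetical values chosen for illustration; they are not the states or estimates of the study.

```python
def mat_mul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [
        [sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
        for i in range(len(A))
    ]


def mat_pow(P, n):
    """Return the n-step transition matrix P^n (n >= 0)."""
    size = len(P)
    result = [[1.0 if i == j else 0.0 for j in range(size)] for i in range(size)]
    for _ in range(n):
        result = mat_mul(result, P)
    return result


# Hypothetical one-year transition matrix (rows sum to 1).
# "cured" and "died" are absorbing: once entered, they are never left.
states = ["on treatment", "cured", "died"]
P = [
    [0.80, 0.15, 0.05],
    [0.00, 1.00, 0.00],
    [0.00, 0.00, 1.00],
]

for n in (1, 5, 13):
    row = mat_pow(P, n)[0]   # distribution after n years, starting on treatment
    print(n, [round(p, 4) for p in row])
```

The first entry of each printed row shrinks geometrically (0.8 to the power n in this toy chain), while the split between the two absorbing states approaches 0.15 : 0.05, that is, 75% cured and 25% dead in the long run.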

Links

Olondo Utshudi Solange

Student Bio

Olondo Utshudi Solange has a great appreciation for the mathematics of finance, particularly actuarial risk management. Coupled with her interest in statistics and the analysis of uncertainties, her undergraduate project (BSc Actuarial Science, 2018) focused on analyzing the Lee-Carter model for the projection of longevity risk. In her master’s project (MSc. Actuarial Science, 2020), she applied the phase-type model to compute actuarial functions for whole life insurance contracts. She anticipates that her contributions will be useful to other researchers and actuarial studies.

Project Summary

Project Title: Using Phase-type Distribution to Estimate Actuarial Functions for Whole Life Insurance Policies

Project Abstract

A whole life insurance policy is a contract between an insurance company and an insured life in which the insurer pays an amount of money, called the sum assured, to the dependents of the policyholder upon the death of the insured life. In return, the policyholder pays a lump sum or regular premiums to the insurance company to cater for the benefits. Recently there has been increasing interest in mortality modelling and projection due to the unprecedented mortality improvement in recent years and the consequent adverse financial impact on pension plans and annuity business. Past attempts at mortality projection often underestimated the overall mortality improvement. There is therefore a need to develop a mortality model that: 1. fits mortality data, 2. takes into account the biological aging process and can flexibly be adjusted to incorporate medical opinion, 3. can be used for the computation of actuarial functions. This paper proposes the use of a phase-type distribution to describe the physiological aging process of the human body and uses the model to compute assurances, annuities and premiums. The one absorbing state of the Markov process is death, while the transient states are the ages. The initial probability that the process starts at phase x for a life aged x entering into a contract is 1, while it is 0 for the other phases. Results showed that annuities are overpriced while assurances are underpriced. The deviation is even greater for policyholders entering contracts when aged between 40 and 80 years. However, there are only small deviations for premiums.

Links

Musili Faith Mueni

Student Bio

Faith Musili holds an MSc. in Social Statistics from the University of Nairobi and a BSc. in Geomatic Engineering and G.I.S from Dedan Kimathi University of Technology. Her education combines remote sensing, GIS and data science skill sets. She is passionate about data science, machine learning, spatial modelling, shiny dashboard development, spatial mapping and remote sensing. She is also an R programming and open source software enthusiast. She is a co-organizer of the R-Ladies Nairobi chapter, which aims to bring diversity and create a networking platform for R users in Nairobi. Previously, Faith was a Junior Data Scientist and Developer at the World Agroforestry Centre (ICRAF). Currently, she is a Senior Data Analyst at the Norwegian Refugee Council (NRC) regional office in Nairobi. Her master’s project was on poverty-based classification of households using K-means and K-medoid clustering algorithms under the guidance of Dr. Timothy Kamanu.

Project Summary

Project Title: Poverty-based classification of households using cluster analysis.

Project Abstract

Poverty in rural areas is complex and multi-dimensional. Most of the poor households in Sub-Saharan Africa (SSA) rely on agriculture for their livelihood. Agri-climatic shocks such as prolonged droughts, outbreaks of animal and human diseases, and crop pests and diseases make rural poor households in SSA vulnerable. Research gaps exist on poverty-based clusters in rural Kenya. Such clusters would be fundamental in understanding the determinants of poverty. This study uses the K-means and K-medoid algorithms to identify poverty-based clusters in rural Kenya. The data used was collected from rural farming households. K-means and K-medoid are the most common clustering algorithms in use and have been implemented by researchers. The results show that rural poor households have a low education level, a high dependency ratio, a low gender parity ratio, low income and low household diet diversity compared to rural non-poor households. Rural non-poor households own productive agricultural assets, seek extension services, are more aware of the financial services and products available to farmers, and access financial services more than rural poor households. Knowledge of the determinants of poverty in rural Kenya can be used by the government, institutions and partners to formulate strategies and policies in an effort to reduce poverty. In future, research should be conducted on the role of land size and land tenure in poverty in rural Kenya.
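For readers unfamiliar with the clustering step, the following is a minimal sketch of K-means (Lloyd's algorithm) on two-dimensional points. It is illustrative only: the function name `kmeans`, the farthest-point initialisation and the toy points are assumptions, and the study's actual variables and software are not reproduced here. K-medoid clustering follows the same assign-and-update pattern but restricts each cluster centre to be one of the observed data points.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iters=100):
    """Cluster points into k groups; returns (labels, centers)."""
    # Deterministic farthest-point initialisation: start from the first
    # point, then repeatedly add the point farthest from chosen centers.
    centers = [list(points[0])]
    while len(centers) < k:
        far = max(points, key=lambda p: min(sq_dist(p, c) for c in centers))
        centers.append(list(far))
    labels = []
    for _ in range(iters):
        # Assignment step: each point goes to its nearest center.
        labels = [
            min(range(k), key=lambda c: sq_dist(p, centers[c])) for p in points
        ]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centers.append(
                    [sum(col) / len(members) for col in zip(*members)]
                )
            else:
                new_centers.append(centers[c])   # keep an empty cluster's center
        if new_centers == centers:
            break
        centers = new_centers
    return labels, centers


# Toy household features, made up for illustration.
points = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.1), (5.0, 5.2), (5.1, 4.9), (4.8, 5.0)]
labels, centers = kmeans(points, k=2)
print(labels)
```

In practice the features would be standardised first, since K-means relies on Euclidean distance and variables on larger scales would otherwise dominate the clustering.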

 

Links

Onchwati Felisters Kerubo

Student Bio

I am a teacher by profession, teaching Mathematics and Geography from Form One to Form Four. I also enjoy guiding and counselling young people to help them make the right decisions in life. I hold a bachelor's degree in Education (Arts) from Moi University and am waiting to be awarded a master's degree in Biometry from the University of Nairobi. My life philosophy is: "Your dreams are valid no matter when, provided they are not born dead or buried alive."

Project Summary

Project Title: Modelling The Cure Rates of Female Sex Workers With STIs using The Mover-Stayer Markov Chain model.

Project Abstract

Human beings suffer from many types of disease. Among them are sexually transmitted infections (STIs), which are either chronic or curable. In this study, a Markov model, specifically a Mover-Stayer Markov Chain model, is used to determine the cure rates of female sex workers who have chronic and curable STIs. The maximum likelihood estimation method is used to obtain the proportions of stayers and movers in the Mover-Stayer Markov Chain model. Octave computational software is used for the analysis, and the data were obtained from the Kenyan Ministry of Health. The study found that 100% of female sex workers who contract chronic STIs do not get healed, while 78.78% of female sex workers who contract curable STIs eventually get healed. It is recommended that further research be conducted on interventions that can prevent female sex workers from contracting chronic STIs, as well as on interventions that drastically reduce the incidence of curable STIs.

 

Links

Kiplelgo Edwin Kiplagat

Student Bio

After completing my high school education at Tenges Boys’ High School, I began my college journey at Moi University, where I pursued a BSc in Applied Statistics with Computing. Since then, I have worked as a management trainee with the Ministry of ICT and later as a statistician with the Kenya Medical Research Institute, where I developed an interest in applying statistical methodologies to address health care challenges. This ambition drove me to pursue a master’s degree in Biometry at the University of Nairobi. My career aspiration is to continually apply my knowledge and expertise to solving health care problems.

Project Summary

Project Title: Trends in Low Birthweight Deliveries as an Indicator of Malaria Transmission

Project Abstract

Low birthweight (LBW, < 2500 g) is a phenomenon that is more pronounced in developing countries, where infectious diseases are most prevalent. Malaria infection during pregnancy has become an important factor associated with LBW in all endemic areas of Sub-Saharan Africa, with increased susceptibility among mothers of lower parity. Hence LBW can serve as an indicator of malaria transmission. Part of its control mechanism is to study the traits that are linked to its spread. This study examined trends in LBW prevalence in the Kilifi Health and Demographic Surveillance System area. Trend significance was assessed using the Mann-Kendall test, while variations in LBW prevalence were assessed using monthly seasonal indices obtained from the Moving Average Method. Change point analysis was conducted to establish the point in time at which a significant change in LBW prevalence occurred. A Seasonal Autoregressive Integrated Moving Average model describing LBW prevalence over time was fitted to assess the trend in the predicted values. Additive Logistic Regression was used to obtain odds ratios of LBW among primiparous women with reference to multiparous women, which were interpreted in relation to contextual information about the changing landscape of malaria transmission. Findings from the study revealed a significant decreasing trend in LBW prevalence. Variations in LBW prevalence could be explained by changes in climatic conditions. Odds ratios for LBW among primiparous women could be used to define the transition of malaria in Kenya. These findings can help the government improve measures to combat malaria transmission in the most affected areas.

Links

Mburu Rachael Waithira

Mburu Rachael Waithira

Student Bio

Rachael Mburu is a data analyst with three years of experience in research and data analysis. She previously worked for the International Centre of Tropical Agriculture (CIAT) as a research and data analysis consultant and currently works for Triggerise as a data analyst. She has experience in data analysis and data visualization using R, STATA, SPSS, Tableau, and Power BI.

Project Summary

Project Title: Comparison of Elastic Net and Random Forest in identifying risk factors of stunting in children under five years of age in Kenya.

Project Abstract

Children with a Height-for-Age Z-score (HAZ) more than 2 Standard Deviations below the World Health Organization (WHO) child growth standards median are said to be stunted. According to the Kenya Demographic and Health Survey (KDHS, 2014), the national prevalence of stunting among under-five children was 26%, slightly higher than the 25% average prevalence in developing countries. This work compares Random Forest and Elastic Net in identifying determinants of under-five childhood stunting, with Variable Importance as the key outcome. The KDHS women and children data was used for analysis. The data was cleaned using STATA and analyzed with R. Because the classes of the response variable were imbalanced, the Synthetic Minority Oversampling Technique (SMOTE) was employed to obtain class-balanced data. Missing observations were imputed using a function from the randomForest library in R. Random Forest and Elastic Net algorithms were used to obtain determinants of stunting, while the Area Under the Curve (AUC) of the Receiver Operating Characteristic (ROC) curve was used to compare the models. The top 5 factors in terms of importance according to Random Forest are: underweight status, region, child’s age, ethnicity, and mother’s current age. According to the Elastic Net algorithm, the top 5 variables by coefficient importance are: underweight children, Nairobi region, a preceding birth interval of 60+ months, children aged 12-23 months, and children of Luhya ethnicity. In terms of ROC values, Random Forest had an AUC of 0.92 while Elastic Net had an AUC of 0.86. Based on our findings, most of the top-ranked variables selected by Random Forest and Elastic Net are similar. Nevertheless, Random Forest performed better than Elastic Net in determining the factors of under-five childhood stunting.
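The AUC used above to compare the two models can be read as the probability that a randomly chosen stunted child receives a higher predicted score than a randomly chosen non-stunted child. The sketch below computes it directly from that pairwise definition. It is only an illustration: the function name and the toy labels and scores are assumptions, and the study itself was carried out in R.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of scores.

    labels: 1 for the positive class (e.g. stunted), 0 otherwise.
    scores: predicted scores, higher meaning "more likely positive".
    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

# Toy example with made-up scores from one hypothetical model.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_auc = roc_auc(toy_labels, toy_scores)
```

Running the same function on the held-out predictions of each model gives two numbers that can be compared directly, as the abstract does with 0.92 and 0.86.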

Links

Chelulei Kipruto Gideon

Chelulei Kipruto Gideon

Student Bio

Gideon Kipruto Chelulei, BSc. Mathematics, MSc. Applied Mathematics. Areas of interest for future research: Numerical Solutions of Systems of Ordinary Differential Equations, Dynamical Systems and Algebraic Geometry.

Project Summary

Project Title: Application of Runge-Kutta Methods for Solving Nonlinear Systems of Ordinary Differential Equations to Unemployment Model.

Project Abstract

Youth unemployment is on the rise globally and poses huge economic, social and political challenges, which result in unsustainable growth of a country’s economy. The main aim of this thesis is to study youth unemployment in Kenya by developing a mathematical model consisting of a first-order system of Ordinary Differential Equations (ODEs), solving it numerically, and interpreting the results from a mathematical perspective. Firstly, we introduce the unemployment problem, ODEs, the meaning of a solution to an ODE, and the conditions a problem must satisfy to be called well posed. We then briefly discuss analytical solutions and the numerical approach to finding a solution to an ODE. Error analysis, convergence, consistency and stability of numerical methods in general are discussed. We consider Runge-Kutta (RK) methods of different orders: the Euler methods and the second-order, third-order and fourth-order RK methods are derived, and a brief formula for the fifth-order method is given. Afterwards, error analysis and stability analysis for RK methods are discussed. After describing the methods for a single equation, we focus on RK methods for solving systems of ODEs, since this is our area of interest. We also analyse the stability of one-step and multi-step methods for ODEs. Once fully equipped with the necessary tools, we obtain the numerical solution of the model using the fourth-order RK method and discuss the results.
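To make the fourth-order RK method for systems concrete, here is a minimal sketch of the classical RK4 step applied to a vector-valued right-hand side. The thesis model is not given in this abstract, so the two-compartment system below (unemployed and employed fractions exchanging at constant rates) is a hypothetical, deliberately simple linear stand-in; the rates, names and initial values are all assumptions.

```python
def rk4_step(f, t, y, h):
    """Advance the system y' = f(t, y) by one classical RK4 step of size h."""
    k1 = f(t, y)
    k2 = f(t + h / 2, [yi + h / 2 * k for yi, k in zip(y, k1)])
    k3 = f(t + h / 2, [yi + h / 2 * k for yi, k in zip(y, k2)])
    k4 = f(t + h, [yi + h * k for yi, k in zip(y, k3)])
    return [yi + h / 6 * (a + 2 * b + 2 * c + d)
            for yi, a, b, c, d in zip(y, k1, k2, k3, k4)]

def rk4_solve(f, t0, y0, h, steps):
    """Apply rk4_step repeatedly and return the final state."""
    t, y = t0, list(y0)
    for _ in range(steps):
        y = rk4_step(f, t, y, h)
        t += h
    return y

# Hypothetical toy model (not the thesis model): fractions of the labour
# force move between "unemployed" and "employed" at constant rates.
HIRE_RATE = 0.3   # assumed rate of unemployed finding work
LOSS_RATE = 0.1   # assumed rate of employed losing work

def labour_flows(t, y):
    unemployed, employed = y
    net_hiring = HIRE_RATE * unemployed - LOSS_RATE * employed
    return [-net_hiring, net_hiring]

final_state = rk4_solve(labour_flows, 0.0, [0.6, 0.4], 0.1, 1000)
```

For this toy system the unemployed fraction settles at LOSS_RATE / (HIRE_RATE + LOSS_RATE), which gives a simple check on the solver; a nonlinear model would only change the body of the right-hand-side function.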

Links

Kimani Mercy Wanjiru

Kimani Mercy Wanjiru

Student Bio

My name is Mercy Kimani. I am a Masters student taking an MSc in Biometry at the University of Nairobi. I did my masters project at the KEMRI-Wellcome Trust Research Programme, where I used genetic data to study the association between vitamin D status and tuberculosis disease. I completed a Bachelor of Science degree in Biochemistry and Molecular Biology at South Eastern Kenya University. I have a great interest in statistical genetics, especially genomic epidemiology, which is the field I hope to build a career in.

Project Summary

Project Title: A causal association of genetically predicted serum 25-hydroxyvitamin D concentrations on the odds of tuberculosis disease: A Two-sample Mendelian Randomization Study

Project Abstract

Tuberculosis is a major cause of ill health and death. In 2018, an estimated 10 million people became ill and 1.2 million died from the disease. Vitamin D deficiency is also a major public health problem, affecting about a billion people globally. Studies have reported an association between vitamin D deficiency and an increased risk of tuberculosis. Some of the mechanisms underlying this association may include the role of vitamin D in both innate and adaptive immune functions. This study aimed to investigate whether there is a causal association between serum 25-hydroxyvitamin D (25(OH)D) levels and odds of tuberculosis disease in Asian and European populations using two-sample Mendelian randomization. I used Mendelian randomization (MR), a technique that employs single nucleotide polymorphisms (genetic variants) as instrumental variables to estimate the causal effect of serum-25(OH)D (intervention) on the outcome of tuberculosis disease. I used genome-wide association studies (GWAS) summary statistics from the UK Biobank as the exposure data and from the International Tuberculosis Human Genetics Consortium (ITHGC) as the outcome data. Results from this study showed no evidence of a causal association of genetically predicted serum-25(OH)D concentrations on the odds of tuberculosis disease.
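In two-sample Mendelian randomization, per-variant effects on the exposure (from one GWAS) and on the outcome (from another) are combined into a single causal estimate. The abstract does not state which estimator was used; the sketch below shows the fixed-effect inverse-variance weighted (IVW) estimator, a common choice, purely as an illustration. The function name and all numbers are made up and are not taken from UK Biobank or ITHGC.

```python
import math

def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect inverse-variance weighted MR estimate and its standard error.

    beta_exposure: per-variant effects on the exposure (e.g. 25(OH)D level).
    beta_outcome:  per-variant effects on the outcome (e.g. log-odds of TB).
    se_outcome:    standard errors of the outcome effects.
    """
    weights = [1.0 / se ** 2 for se in se_outcome]
    numerator = sum(w * bx * by
                    for w, bx, by in zip(weights, beta_exposure, beta_outcome))
    denominator = sum(w * bx * bx for w, bx in zip(weights, beta_exposure))
    estimate = numerator / denominator
    std_error = math.sqrt(1.0 / denominator)
    return estimate, std_error

# Made-up summary statistics for three hypothetical variants.
example_estimate, example_se = ivw_estimate(
    [0.10, 0.20, 0.30],   # effects on exposure
    [0.05, 0.10, 0.15],   # effects on outcome
    [0.01, 0.02, 0.03],   # outcome standard errors
)
```

An estimate whose confidence interval (estimate plus or minus about 1.96 standard errors) includes zero would be read as no evidence of a causal effect, which is the kind of conclusion reported above.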

Links

Odhiambo Evance Ochieng

ODHIAMBO EVANCE OCHIENG

Student Bio

Evance Ochieng was born and grew up in Wagwe Kobiero location, Rachuonyo North subcounty, Homabay county, in a family of three. He went to Kagai Primary School and Madogo Secondary School, and later joined the University of Nairobi, where he took a BSc and an MSc (Applied Mathematics).

Project Summary

Project Title: Numerical Solutions To Nonlinear Hyperbolic Systems Of Conservation Laws Applied To Traffic Flow Theory.

Project Abstract

In the recent past, the rapidly growing number of vehicles on long, crowded roads has prompted rigorous scientific research in the field of traffic flow modeling. In this thesis we present and discuss some of the macroscopic models of vehicular traffic flow, namely the Payne-Whitham (P-W) and the Aw-Rascle models, both of which are second order. We study the Riemann problems of these models. The numerical method developed here is the finite volume method (FVM), more specifically the Godunov-type approximation, together with the CFL condition as the stability condition for the solutions.
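The Godunov-type finite volume update and the CFL restriction can be sketched on a much simpler model than the ones studied in the thesis. The example below uses the first-order, scalar LWR traffic model with a normalized Greenshields flux f(rho) = rho(1 - rho), not the second-order Payne-Whitham or Aw-Rascle systems. For this concave flux the Godunov interface flux reduces to the minimum of upstream "demand" and downstream "supply". The periodic road, grid and initial data are assumptions chosen for illustration.

```python
RHO_CRIT = 0.5  # density at which the normalized Greenshields flux peaks

def flux(rho):
    """Normalized Greenshields flux: max speed 1, max density 1."""
    return rho * (1.0 - rho)

def godunov_flux(rho_left, rho_right):
    """Godunov interface flux for a concave flux (demand/supply form)."""
    demand = flux(min(rho_left, RHO_CRIT))   # what upstream can send
    supply = flux(max(rho_right, RHO_CRIT))  # what downstream can receive
    return min(demand, supply)

def godunov_step(rho, dx, dt):
    """One conservative finite volume update on a periodic ring road."""
    n = len(rho)
    # interface[i] is the flux between cell i-1 and cell i (rho[-1] wraps around)
    interface = [godunov_flux(rho[i - 1], rho[i]) for i in range(n)]
    return [rho[i] - dt / dx * (interface[(i + 1) % n] - interface[i])
            for i in range(n)]

def simulate(rho0, dx, t_end, cfl=0.9):
    """March to t_end with dt chosen from the CFL condition dt <= dx / max|f'|."""
    rho, t = list(rho0), 0.0
    dt = cfl * dx  # |f'(rho)| = |1 - 2 rho| <= 1 for densities in [0, 1]
    while t < t_end - 1e-12:
        step = min(dt, t_end - t)
        rho = godunov_step(rho, dx, step)
        t += step
    return rho

# Assumed initial data: a dense platoon on half the ring, light traffic elsewhere.
cells = 100
dx = 1.0 / cells
initial = [0.8 if i < cells // 2 else 0.2 for i in range(cells)]
final = simulate(initial, dx, t_end=0.2)
```

Because the update is written in flux-difference form, the total number of vehicles on the ring is conserved, and keeping the CFL number at or below 1 keeps the densities within the range of the initial data.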

Links

Thiong'o Joseph Magu

Student Bio

Joseph Magu Thiong'o is a University of Nairobi School of Mathematics candidate for graduation on 11 December 2020. He has successfully completed a masters in Social Statistics, with a project on modelling the key determinants of child labour in Kenya. He has a great love for mathematics, mainly the modelling and analysis of data. He financed his own degree and masters education. Upon graduating, he intends to pursue a PhD in Social Statistics. Due to financial constraints, he would be pleased to obtain a PhD scholarship. He holds a degree in science education (B.Ed Mathematics) from Mount Kenya University and a first class Diploma from the Kenya Science Campus of the University of Nairobi. He did his primary and high school education at Muthangari Primary School and Nyagatu Boys. He has been employed in the teaching service as a high school mathematics and chemistry teacher for twelve years, where he has been involved in data management and analysis. He has also been involved in data management at the Kenya National Bureau of Statistics.

Project Summary

Project Title: Modelling The Key Determinant of Child Labour in Kenya

Project Abstract

Child labour is an effect of many factors that are addressed in the MDGs, the SDGs and various policy documents. In recent years, programmes and policies have not been established to address the issues of child labour, owing to the fact that it has not been adequately captured or analysed in national data and statistics. The main objective of this study is to investigate the key determinants of child labour in Kenya. The study focused on children aged between 5 and 14 years, using the KNBS Household Survey data of 2017. Mixed-effect binary logistic regression was conducted to analyse the data. The explanatory variables are: child age and sex, household size, gender of the family head, type of household residence, relationship of the child to the household head, household head's level of education, hours spent by the child on household chores, average monthly household income and expenditure, and area of residence. The model results show that the age of the child, the highest grade attended by the household head (household head education), average monthly household income, hours spent by the child on household chores, and area of residence are important determinants of child labour in Kenya. The findings indicate that the chance of a child being engaged in work increases with age. Household income has a negative influence on the chance of child labour. A higher level of education of the household head decreases the chance of sending a child to work. In addition, an increase in hours spent on household chores increases the possibility of child labour. Lastly, the type and area of residence significantly affect child labour. The policy interventions to be enhanced to reduce child labour are: improving households' living conditions by increasing their average monthly income; raising adult literacy levels; reducing the hours children spend on household chores; enhancing gender equality in education; and addressing regional disparities in the probability of child labour by allocating more educational resources to the devolved government units with high child labour probability.

 

Links

Mutisya Jannis Ndungwa

Mutisya Jannis Ndungwa

Student Bio

I was born and raised in Machakos County, where I attended Yikiatine Primary School between 1995 and 2003. After scoring 402/500 marks in the 2003 KCPE exam, I attended St. Angela's Girls' Secondary School in Kitui County between 2004 and 2007, where I scored an A- (minus) in KCSE. I enrolled at the University of Nairobi in 2009 to study for a Bachelor of Economics and Statistics and graduated in 2013 with Second Class Honours (Upper Division).

Project Summary

Project Title: Evaluating Generalized Linear Models For Count Data With Application To Pre-Exposure Prophylaxis HIV Sero-Conversion Data.

Project Abstract

Generalized Linear Models (GLMs) are a strategy for tackling statistical questions, especially those that involve non-normally distributed data, in such a way that much of the simplicity of the linear model is retained. This study aimed to evaluate Generalized Linear Models for count data with application to Pre-Exposure Prophylaxis (PrEP) HIV sero-conversion data. The study used data retrieved from the Kenya Health Information System (KHIS) for the period March to April 2019 from 104 health facilities. Poisson, Quasi-Poisson, Negative Binomial, and Conway-Maxwell Poisson regression models were compared to determine the best model for HIV sero-conversion among PrEP users in Kenya. Model fit was compared using the Akaike information criterion (AIC). From the results, the Conway-Maxwell Poisson regression was considered the better model for analyzing PrEP data in Kenya, since its AIC value was the lowest.

Links

Njeru Edwin Murithi

Njeru Edwin Murithi

Student Bio

Edwin Murithi is a young Kenyan, born and brought up in Embu. He first graduated from the University of Nairobi in 2005 with a first class in Mathematics. In 2013 he graduated from the same university with an MBA in Finance. Edwin is a seasoned auditor, having been an Internal Auditor for over 10 years in the financial sector. His main interest, in and out of class, is large data analysis.

Project Summary

Project Title: USE OF HAAR WAVELETS TO SOLVE OPTIMAL CONTROL PROBLEM.

Project Abstract

Optimal control is an important branch of mathematics that has been widely applied in a number of fields, including engineering, science and economics. We aimed to find the performance index of an optimal control problem under the best control by solving a nonlinear partial differential equation known as the Hamilton-Jacobi-Bellman equation. In this project we established the added value and advantages of using Haar wavelets in solving optimal control problems by looking first at the fundamentals of optimal control theory, then at the Hamilton-Jacobi-Bellman equation, and finally at the application of the Haar wavelet method to sample problems. We found that the Haar wavelet functions gave very satisfactory results and indeed added value in the computation of optimal control problems.

Links

Lokaran Elizabeth

Lokaran Elizabeth

Student Bio

I am Lokaran Elizabeth, a trained science teacher with a diploma in Mathematics and Chemistry from Kenya Science Teachers College, from which I graduated in 2001. I also hold a B.Ed (Science) in Mathematics and Chemistry from Kenyatta University, from which I graduated in 2011, and a Masters in Applied Mathematics from the University of Nairobi, class of 2020. My areas of interest are: mathematical modelling of infectious diseases (HIV/AIDS), ordinary differential equations (ODEs), partial differential equations (PDEs), numerical analysis and the study of dynamical systems.

Project Summary

Project Title: A mathematical model on effect of drug abuse in transmission of HIV/AIDS in Turkana county

Project Abstract

Drug and substance misuse has been recognized to have a significant impact on the spread of the HIV/AIDS epidemic. We formulate a deterministic model to assess the contribution of drug and substance abuse to the spread of HIV/AIDS among adults in Turkana County, Kenya. The basic reproduction number of the model is determined and the stability of the equilibria is analyzed. The disease-free equilibrium point is shown to be globally asymptotically stable when the corresponding basic reproduction number is less than unity. The Lyapunov theorem is used to show that the endemic equilibrium point is globally asymptotically stable when the corresponding reproduction number is greater than unity.

Links

Okiabera Joel Omae

Okiabera Joel Omae

Student Bio

My name is Joel Omae Okiabera. I work with the Independent Electoral and Boundaries Commission as a Constituency Elections Coordinator. My dream is to one day be the Chief Executive of a large organization. To achieve this, I am committed to improving my skills by attaining the highest possible level of education. Attaining a Master of Science in Social Statistics has been my biggest achievement so far.

Project Summary

Project Title: Using Random Forest to identify key determinants of Poverty in Kenya

Project Abstract

Under the SDGs set by the United Nations, it was projected that all forms of poverty would be eliminated by the year 2030. Although Kenya has made tremendous improvements in poverty reduction, it is unlikely to eradicate poverty by 2030. Studies of poverty determinants in Kenya have mostly been done using classical regression methods. The World Bank has suggested the use of the Random Forest technique, as it is more robust in studying determinants of poverty. This study applied a Random Forest technique to the 2014 Kenya Demographic and Health Survey (KDHS) dataset to explore poverty determinants in Kenya. The outcome variable is the wealth index, categorized in five levels: poorest, poorer, middle, richer and richest. The independent variables included region, type of residence, education level, sex of the household head, marital status, number of household members and age of the household head. The 2014 KDHS dataset has the wealth index of an individual coded in the five categories, while the determinant variables are both categorical and continuous. Random Forest is an algorithm used for classification and regression, usually constructed from a set of classification and regression trees. Random forests are a significant improvement over classical regression techniques. Region of residence and level of education should be considered when interventions are being made to improve livelihoods in the country.

Links

Terer Mercy Chepkirui

Terer Mercy Chepkirui

Student Bio

Mercy Chepkirui Terer is a data scientist with vast experience in software engineering and data management. Ms Terer did her Bachelor of Science degree in Computer Science at the University of Eldoret (2010-2014). She has demonstrated technical, analytical and research skills in several projects that have led to more effective data usage and better-quality survey results at the KEMRI-Wellcome Trust Research Programme, where she currently works. Her background in computer science has greatly shaped her computational and algorithmic problem-solving skills. She holds a master’s degree in biostatistics from the University of Nairobi (2018-2020).

Project Summary

Project Title: A Comparative Analysis of Unsupervised Outlier Detection Methods for Data Quality Assurance

Project Abstract

Data quality assurance is a key component in research. It is almost impossible to routinely check for errors in large datasets if automated smart mechanisms are not put in place. Good quality data leads to effective and unbiased reporting. Errors introduced into the data are inevitable hence the need to have error-checking mechanisms. Error checking mechanisms such as the use of range checks, quantile ranges, and z-scores are limited to continuous data types and small feature space data. Errors in dichotomous and character data types are easily omitted hence the need to use methods that scan anomalies for all data types in large datasets. Two-Pass Verification (TPV) on the other hand is a gold standard method for checking the quality state of data. However, it is a tedious and manual process that relies on random sampling for larger datasets. We propose possible alternative methods for error checking by applying machine learning outlier detection algorithms. The observations that are outlying are subjected to cross-referencing for possible errors instead of randomly selecting a set of observations. We evaluated k-means clustering and isolation forest unsupervised machine learning algorithms to detect outliers. We then compared TPV, k-means, and isolation forest anomaly scores. Normalized mutual information score and the coefficient of determination metrics were used to determine the strength of the correlation. The results indicated that unsupervised machine learning methods can be possible alternatives for data quality assurance. Isolation forest performed best compared to k-means clustering.

Links

Owino Matilda Awuor

Owino Matilda Awuor

Student Bio

Matilda is an enthusiastic statistician who has developed a great interest in research work through empowering herself academically. Her passion for statistics developed at undergraduate level, driven by the aim of knowing more than she already knew. This pushed her to go for further studies, and to her a masters degree is just one of the many steps ahead in the world of Statistics and Research.

Project Summary

Project Title: Optimal Multi Type Step Wise Group Screening Designs-A Numerical Method Approach

Project Abstract

This project considers optimum stepwise group screening designs. Numerical methods are used to obtain the optimum group sizes. Newton's method is used as the numerical tool for minimizing the expected total number of runs (tests) in the stepwise group screening design, which selects and separates the defective factors from a population containing both defective and non-defective factors (observations). The expected total number of runs is minimized when the optimum group sizes are obtained from several iterations of Newton's method, for different stages, i.e., stage one (one group size), stage two (two group sizes) and stage three (three group sizes). At each stage, comparisons are made with the calculus method used by other researchers to obtain the optimum group sizes. Both procedures are afterwards confirmed against the results of a computer search.
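The idea of using Newton's method to find an optimum group size can be sketched on a simpler, well-known case. The expected-runs formula for the stepwise design is not given in this abstract, so the example below swaps in the classical one-stage (Dorfman-type) group testing cost per factor, E(k) = 1/k + 1 - (1 - p)^k, and applies Newton's method to its derivative. The function names, the defect probability p and the starting value are assumptions for illustration only.

```python
import math

def expected_tests_per_factor(k, p):
    """One-stage group testing cost per factor (stand-in objective).

    k: group size (treated as continuous), p: probability a factor is defective.
    """
    return 1.0 / k + 1.0 - (1.0 - p) ** k

def newton_optimum_group_size(p, k0, tol=1e-10, max_iter=50):
    """Find a stationary point of the stand-in objective with Newton's method."""
    q = 1.0 - p
    ln_q = math.log(q)
    k = k0
    for _ in range(max_iter):
        first = -1.0 / k ** 2 - q ** k * ln_q        # dE/dk
        second = 2.0 / k ** 3 - q ** k * ln_q ** 2   # d2E/dk2
        step = first / second
        k -= step
        if abs(step) < tol:
            break
    return k

# Assumed defect probability of 1% and a starting guess of 10 factors per group.
optimum = newton_optimum_group_size(p=0.01, k0=10.0)
```

In practice the continuous optimum would be rounded to a neighbouring integer group size and checked against a direct search, which mirrors the computer-search confirmation mentioned in the abstract.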

Links