**Lupine Publishers | Journal of Biostatistics & Biometrics**

## Abstract

This paper aimed at discriminating between women who ovulate in a shorter time than the expected twenty-eight (28) days, those who ovulate in the expected twenty-eight (28) days and those who take longer than the expected twenty-eight (28) days. A total of two hundred (200) women of reproductive age were selected for the study. Questionnaires were used to obtain the relevant information on their ovulation intervals. The factors affecting ovulation considered in the study were Age, Height, Weight, Work Time (stress), Menstrual Duration, Number of Conceptions, Number of Births and Exposure to Sun. A three-way linear discriminant function was formulated for the data. Using the formulated functions, the women were classified, and the following probabilities were observed: the probability of misclassifying a woman into the short or normal ovulation interval when she is actually in the long ovulation interval is 0.9863; the probability of misclassifying a woman into the short or long ovulation interval when she is actually experiencing normal ovulation is 0.3; the probability of misclassifying a woman into the long or normal ovulation interval when she is actually experiencing short ovulation is 1.0; and the total probability of misclassification of the discriminant function is 0.715.

**Keywords:** Three-way; discriminant analysis; ovulation; misclassification; multivariate; probability.

## Introduction

A good knowledge of the ovulation interval of women is imperative in preconception gender selection. Different beliefs and methods for the procreation of offspring of a desired sex have been adopted, but recent research has shown that proper determination of the ovulation interval contributes greatly to the efficiency of sex determination. Discriminant analysis has been applied to the post-mortem discrimination of Felis catus by sex using skull measurements [1]. The level of fluctuation of the ovulation interval in women is high and depends on many variables. While some women ovulate normally (every twenty-eight days), some experience short ovulation (not more than twenty-four days) and others experience long ovulation (greater than thirty-two days). The objective of this paper is to classify the women in the study population according to their ovulation experience.

## Methodology

The data for this study were collected from 200 women selected in Ika North-East and Ika South Local Government Areas, Delta State. The women were mainly health workers and teachers in primary and secondary schools in the selected areas. Data were obtained on their age, height, weight, work time (hours/day), menstrual duration, number of conceptions, number of births, exposure to sun and ovulation interval. The women were classified according to the length of their stated ovulation interval, and the a priori probabilities for the groups were obtained.

Discriminant analysis is concerned with discriminating between two or more groups and assigning a new observation to a group with a low probability of misclassification [2-7]. Anderson [8] developed a method for discrimination and classification which shows that, for known a priori probabilities and misclassification costs, the optimum rule is based upon the likelihood ratios of all pairs of multivariate normal population densities f_i(X) and f_j(X). The ratio of the i^{th} to the j^{th} density is

The region of classification into population π_i is the set of X's for which Equation 2 is greater than k (k suitably chosen). That is,

The population parameters μ_i, μ_j and Σ may be estimated with their respective sample estimators X̄_i, X̄_j and S, where

is the pooled variance-covariance matrix for k groups.
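For reference, the density ratio and the pooled matrix described above have well-known textbook forms when the populations are multivariate normal with a common covariance matrix Σ. The block below is a sketch in the notation of this section and does not reproduce the original numbered equations; n_i (the size of the i^{th} sample) and S_i (the sample variance-covariance matrix of the i^{th} group) are symbols assumed here.

```latex
% Ratio of the i-th to the j-th multivariate normal density (common \Sigma)
\frac{f_i(X)}{f_j(X)}
  = \exp\left\{ \left[ X - \tfrac{1}{2}(\mu_i + \mu_j) \right]' \Sigma^{-1} (\mu_i - \mu_j) \right\}

% Pooled variance-covariance matrix for k groups
% (n_i and S_i are assumed symbols for group size and group covariance)
S = \frac{\sum_{i=1}^{k} (n_i - 1)\, S_i}{\sum_{i=1}^{k} n_i - k}
```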

For the common case of unknown parameters, the discriminant
function is

The discriminant function, Y_{ij}, for every i ≠ j will discriminate between two completely specified groups (populations), π_i and π_j. The classification rule now is: assign X to π_i if

But if the a priori probabilities of π_i and π_j are known, then k is given by

as defined in Equation 1. If the two populations are equally likely, that is, their a priori probabilities are equal, and the misclassification costs are also equal, that is, C(i / j) = C(j / i), then k = 1 and log_{e}k = log_{e}1 = 0. The best classification rule with known and equal a priori probabilities and misclassification costs is then: assign an observation with measurement X to population π_i if Ŷ_{ij} > 0 ∀ j ≠ i; otherwise assign it to π_j. With the relations Y_{ij} = -Y_{ji} and Y_{1j} - Y_{1i} = Y_{ij}, k populations or groups require k-1 linearly independent discriminant functions to obtain the best regions of classification.
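The pairwise functions and the two relations between them can be illustrated with a minimal sketch. It assumes the standard sample linear discriminant function, Y_ij(X) = (X̄_i - X̄_j)' S^{-1} X - ½ (X̄_i - X̄_j)' S^{-1} (X̄_i + X̄_j), with S the pooled variance-covariance matrix. The three small groups below are synthetic values invented for illustration, not the study data, and all function names are hypothetical.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def mean_vector(rows: Matrix) -> Vector:
    """Column means of one group's observations (one row per individual)."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def scatter_matrix(rows: Matrix) -> Matrix:
    """Sums of squares and cross-products about the group mean."""
    m = mean_vector(rows)
    p = len(m)
    return [[sum((r[a] - m[a]) * (r[b] - m[b]) for r in rows) for b in range(p)]
            for a in range(p)]


def pooled_covariance(groups: List[Matrix]) -> Matrix:
    """Pooled S: summed within-group scatter over (N - number of groups)."""
    p = len(groups[0][0])
    dof = sum(len(g) for g in groups) - len(groups)
    scatters = [scatter_matrix(g) for g in groups]
    return [[sum(s[a][b] for s in scatters) / dof for b in range(p)]
            for a in range(p)]


def solve(a: Matrix, b: Vector) -> Vector:
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def discriminant(x: Vector, mean_i: Vector, mean_j: Vector, s: Matrix) -> float:
    """Y_ij(x) = (m_i - m_j)' S^{-1} x - 0.5 (m_i - m_j)' S^{-1} (m_i + m_j)."""
    diff = [a - b for a, b in zip(mean_i, mean_j)]
    coef = solve(s, diff)  # S^{-1} (m_i - m_j); valid because S is symmetric
    midpoint = [(a + b) / 2 for a, b in zip(mean_i, mean_j)]
    return sum(c * (xv - mv) for c, xv, mv in zip(coef, x, midpoint))


# Synthetic data (NOT the study data): three groups, two variables each.
groups = [
    [[1.0, 2.0], [2.0, 1.0], [2.0, 3.0], [3.0, 2.0]],
    [[5.0, 5.0], [6.0, 4.0], [6.0, 6.0], [7.0, 5.0]],
    [[9.0, 1.0], [10.0, 0.0], [10.0, 2.0], [11.0, 1.0]],
]
means = [mean_vector(g) for g in groups]
s_pooled = pooled_covariance(groups)

x_new = [5.5, 4.5]  # hypothetical new observation
y12 = discriminant(x_new, means[0], means[1], s_pooled)
y13 = discriminant(x_new, means[0], means[2], s_pooled)
y21 = discriminant(x_new, means[1], means[0], s_pooled)
y23 = discriminant(x_new, means[1], means[2], s_pooled)
```

In this sketch only `y12` and `y13` need to be computed directly. `y21` equals `-y12` and `y23` equals `y13 - y12`, which is why three groups require only two linearly independent functions. The synthetic `x_new` lies closest to the second group, so both `y21` and `y23` come out positive.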

The discriminant procedure is evaluated with the aid of a confusion matrix. The Apparent Error Rate (AER), a measure of the tendency for individual items to be wrongly classified, is determined as the proportion of misclassified items. Let

where P_i is the probability of misclassifying an item that truly belongs to the i^{th} group into any of the other groups.

For the case of three groups

where w_{ij}, i ≠ j, is the number of items misclassified into group i while they belong to group j, and C_{ij}, i = j, is the number of correctly classified items.
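In terms of the confusion-matrix counts, the group misclassification probabilities and the AER can be written as follows. This is a sketch of the usual definitions rather than the original displayed equations; n_j, the number of items that truly belong to group j, is a symbol assumed here.

```latex
% Size of group j: correctly classified items plus items misclassified out of j
n_j = C_{jj} + \sum_{i \neq j} w_{ij}

% Probability of misclassifying an item that truly belongs to group j
P_j = \frac{1}{n_j} \sum_{i \neq j} w_{ij}, \qquad j = 1, 2, 3

% Apparent Error Rate: proportion of all items that are misclassified
\mathrm{AER} = \frac{\sum_{j=1}^{3} \sum_{i \neq j} w_{ij}}{\sum_{j=1}^{3} n_j}
```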

## Data Analysis

The variables x_i, i = 1, 2, …, 8, represent Age in years, Height in meters, Weight in kg, Work Time in hours/day, Menstrual Duration in days, Number of Conceptions, Number of Births and Exposure to Sun in hours/day, respectively, and x_9 is the ovulation interval in days. The average ovulation interval, x̄_9, is 16.06 days for the 47 women experiencing a short interval, 28.89 days for the 80 women experiencing a normal interval and 48.95 days for the 73 women experiencing a long interval. The group variance-covariance matrices and the pooled variance-covariance matrix were obtained. The two linearly independent discriminant functions, Y_{12} and Y_{13}, were then obtained.

The classification rules are: assign an individual with measurement X of unknown origin to π_1 if Y_{12} > 0.531879 and Y_{13} > 0.440312 from Equation 6; assign the individual to π_2 if Y_{21} > -0.531879 and Y_{23} > -0.0915672; otherwise assign the individual to π_3.

The estimated misclassification probabilities are P_3 = 0.9863, P_2 = 0.3 and P_1 = 1. They show that about 986 out of every 1000 women experiencing long ovulation were misclassified as experiencing short or normal ovulation; 300 out of every 1000 women experiencing normal ovulation were misclassified as experiencing short or long ovulation; and all the women experiencing short ovulation were misclassified as experiencing long or normal ovulation. The total probability of misclassification is 0.715, showing that 715 out of every 1000 women were misclassified by the discriminant function.
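The reported figures can be cross-checked with a short calculation. The sketch below assumes that the a priori probabilities are the sample proportions of the three groups (47, 80 and 73 out of 200) and that the misclassification costs are equal, so that each threshold is log_e(q_j / q_i). Under that assumption the thresholds agree in magnitude with the values 0.531879, 0.440312 and 0.0915672 quoted in the classification rules. The misclassified counts are inferred from the reported probabilities and group sizes; they are not stated in the text.

```python
import math

# Group sizes reported in the Data Analysis section.
n = {"short": 47, "normal": 80, "long": 73}
total = sum(n.values())


def threshold(i: str, j: str) -> float:
    """log_e(q_j / q_i), assuming priors q = n / N and equal costs."""
    return math.log((n[j] / total) / (n[i] / total))


t12 = threshold("short", "normal")  # threshold for Y_12
t13 = threshold("short", "long")    # threshold for Y_13
t23 = threshold("normal", "long")   # threshold for Y_23 (negative)

# Reported misclassification probabilities P_1, P_2, P_3.
p = {"short": 1.0, "normal": 0.3, "long": 0.9863}

# Implied numbers of misclassified women (inferred, rounded to whole women).
miscount = {g: round(p[g] * n[g]) for g in n}

# Overall apparent error rate: misclassified women over all women.
aer = sum(miscount.values()) / total

print(f"t12={t12:.6f}, t13={t13:.6f}, t23={t23:.7f}")
print(f"misclassified={miscount}, AER={aer:.3f}")
```

The overall rate is the prior-weighted average of the three group probabilities, which is why a group with P_1 = 1 and another with P_3 close to 1 dominate the total despite the moderate error rate in the normal group.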

## Conclusion

From the results of the data analysis, the following conclusions may be reached: the discriminant function is associated with a high probability of misclassification; important variables may have been excluded from the study; and the selected women may be experiencing fluctuating ovulation intervals, such that tracking a woman’s interval would require studying her experiences independently over a long period of time.

