# Key concepts in bio-statistics

Absolute risk increase (ARI) - The increase in risk with a new treatment compared with the risk without it. Absolute risk reduction (ARR) - The decrease in risk with a new treatment compared with the risk without it. Absolute value - The magnitude of a number regardless of its sign. Adjusted rate - A rate standardized for variables that affect the calculated rate. Age-specific mortality rate - The mortality rate within a specific age group. Alpha (α) error, type I error - See Type I error. Alternative hypothesis - The hypothesis that stands as the alternative to the null hypothesis. Analysis of variance (ANOVA) - A statistical method that determines whether there are differences among two or more groups of subjects for one or more factors. The F test is used to perform ANOVA. Analysis of covariance (ANCOVA) - A statistical method, a form of analysis of variance or regression, used to control for the possible effect of confounding variables. Bar chart or bar graph - A chart or graph used with nominal characteristics to show frequencies or relative frequencies. Bayes' theorem - A formula for calculating the conditional probability of one event, P(A|B), from the conditional probability of the other event, P(B|A). Bell-shaped distribution - A term used to describe the shape of the normal (Gaussian) distribution. Beta (β) error - See Type II error; the error of mistakenly rejecting the researcher's (alternative) hypothesis. Binary observation - A nominal measure that has only two possible outcomes, for example sex (male or female) or survival (yes or no). Binomial distribution - The probability distribution of the number of successes X observed in n independent trials, where the probability of success is the same in every trial. Biometrics - The study of measurement and statistical analysis in medicine and biology. Biostatistics - The application of research design and statistical analysis in medicine and biology.
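
The risk-difference and Bayes' theorem entries above can be illustrated with a short calculation. This is a minimal sketch, not a reference implementation: the function names and all numbers are hypothetical and chosen only for illustration.

```python
def absolute_risk_change(risk_without, risk_with):
    """Return (ARR, ARI) for a treatment, given the two risks as proportions.

    ARR is positive when the treatment lowers the risk,
    ARI is positive when the treatment raises it.
    """
    difference = risk_without - risk_with
    arr = max(difference, 0.0)
    ari = max(-difference, 0.0)
    return arr, ari


def bayes_posterior(p_b_given_a, p_a, p_b):
    """Bayes' theorem: P(A|B) = P(B|A) * P(A) / P(B)."""
    return p_b_given_a * p_a / p_b


# Hypothetical example: risk of 20% without treatment, 15% with treatment.
arr, ari = absolute_risk_change(0.20, 0.15)

# Hypothetical example: A = disease present, B = positive test.
# P(B|A) = 0.90, P(A) = 0.01, P(B|not A) = 0.05.
p_b = 0.90 * 0.01 + 0.05 * 0.99
posterior = bayes_posterior(0.90, 0.01, p_b)
```

With these made-up inputs the treatment lowers the risk by about 5 percentage points, and the probability of disease given a positive test stays well below the 90% that the test's hit rate might suggest.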
Blind study - An experimental study in which subjects do not know which treatment they are receiving; the researchers may also be "blind" to the treatment given to the subjects (see Double-blind trial). Block design - In analysis of variance, a research design in which the subjects within each block or stratum are assigned to different treatments. Bonferroni t - A method for comparing means in analysis of variance; also called Dunn's multiple-comparison procedure. Bootstrap - A method for estimating standard errors or confidence intervals in which a sample of observations is drawn at random, with replacement, from the original sample. The estimate is calculated from each drawn sample, and the process is repeated many times to create a distribution from which the standard errors or confidence intervals are determined. Box plot - A graph showing the location and spread of observations. It is useful for comparing two distributions. Case-control study - An observational study that begins with subjects who have the outcome or disease under investigation (cases) and with control subjects who do not. The study then looks back to identify preexisting characteristics or risk factors. Case-series study - A description of interesting or intriguing characteristics observed in a group of subjects. Categorical observation - A variable whose values are categories, for example anemia, diabetes, or side effects. See Nominal scale. Censored observation - An observation whose value is unknown, most often because the subject has not been in the study long enough for the outcome of interest (for example, death) to occur, or because the subject dropped out of the study.
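
The bootstrap entry above can be sketched in a few lines. This is a minimal illustration, assuming the statistic of interest is the mean and a simple percentile interval is acceptable; the function name, the default number of resamples, and the data are hypothetical.

```python
import random
import statistics


def bootstrap_mean(data, n_resamples=2000, seed=0):
    """Estimate the standard error and a 95% percentile interval of the mean.

    Each resample has the same size as the original data and is drawn
    with replacement, so an observation can appear more than once.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(data)
    means = []
    for _ in range(n_resamples):
        resample = [rng.choice(data) for _ in range(n)]
        means.append(statistics.mean(resample))
    means.sort()
    standard_error = statistics.stdev(means)
    lower = means[int(0.025 * n_resamples)]
    upper = means[int(0.975 * n_resamples) - 1]
    return standard_error, (lower, upper)


# Hypothetical measurements, for illustration only.
observations = [4.1, 5.3, 3.8, 6.0, 5.5, 4.7, 5.1, 4.9, 6.2, 3.9]
se, (ci_low, ci_high) = bootstrap_mean(observations)
```

The percentile interval is only one of several bootstrap intervals; it is used here because it is the simplest to read.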
Central limit theorem - A theorem stating that the distribution of sample means is approximately normal if the sample size is large enough (n ≥ 30), regardless of the underlying distribution of the original measurements. Chance agreement - A measure of the proportion of times that two or more raters would agree on their measurements or assessments of a particular phenomenon by chance alone. Chi-square (χ²) distribution - The distribution used for analyzing contingency tables and comparing distributions. Chi-square (χ²) test - A statistical test of the null hypothesis that proportions are equal or, equivalently, that factors or characteristics are independent of each other. Clinical epidemiology - The application of epidemiology to clinical medicine and decision making. Clinical trial - An experimental study of a drug, procedure, or medical device in which the subjects are humans. Cluster analysis - A statistical classification method that groups objects or subjects on the basis of multiple measurements. Coefficient of determination (r²) - The square of the correlation coefficient; the proportion or percentage of the variance in one variable that is explained by knowledge of the other variable. Coefficient of variation (CV) - The standard deviation divided by the mean, usually multiplied by 100. Used to measure relative variation. Cohort - A group of subjects who remain together in the same study over time. Cohort study - An observational study that begins with a group of subjects who have a risk factor or were exposed to it and a group of subjects without the risk factor or exposure. Both groups are followed over time to learn how many in each group develop the disease or outcome of interest.
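
As a rough illustration of the coefficient of variation and the coefficient of determination defined above, the sketch below computes both from raw data. The helper names and the sample values are hypothetical; the sample standard deviation (n - 1 in the denominator) is assumed.

```python
import statistics


def coefficient_of_variation(values):
    """Standard deviation divided by the mean, expressed as a percentage."""
    return statistics.stdev(values) / statistics.mean(values) * 100


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mean_x, mean_y = statistics.mean(x), statistics.mean(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


# Hypothetical data, for illustration only.
cv = coefficient_of_variation([2, 4, 4, 4, 5, 5, 7, 9])
r = pearson_r([1, 2, 3, 4], [2, 4, 6, 8])
r_squared = r ** 2  # coefficient of determination
```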
Combination - A formula that gives the number of ways to choose a certain number of items, say x, from a total of n items in the population or sample. Concurrent controls - A control group assigned to a placebo or control condition during the same period as the experimental treatment or procedure. Conditional probability - The probability that an event, say A, occurs given that another event, say B, has occurred. It is written P(A|B). Confidence coefficient - The term in the formula for a confidence interval that determines the level of confidence associated with the interval, for example 90%, 95%, or 99%. Confidence interval (CI) - An interval calculated from a sample that contains the unknown parameter, such as a mean or proportion, with a given probability. Commonly used confidence levels are 90%, 95%, and 99%. Confidence limits - The limits of a confidence interval. They are calculated from a sample, and there is a given probability that the unknown parameter lies between them.
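
The combination and confidence-interval entries above lend themselves to a small worked example. This sketch assumes a large-sample normal approximation for the interval (small samples would normally use the t distribution instead); the function names and the data are hypothetical.

```python
import math
from statistics import NormalDist, mean, stdev


def n_choose_x(n, x):
    """Number of ways to choose x items from n without regard to order."""
    return math.comb(n, x)


def mean_confidence_interval(values, confidence=0.95):
    """Confidence limits for a mean, using the normal approximation."""
    # The confidence coefficient: about 1.96 for a 95% interval.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    center = mean(values)
    standard_error = stdev(values) / math.sqrt(len(values))
    return center - z * standard_error, center + z * standard_error


# Hypothetical inputs, for illustration only.
ways = n_choose_x(5, 2)
readings = [120, 118, 125, 130, 122, 119, 127, 124, 121, 126]
lower_limit, upper_limit = mean_confidence_interval(readings)
```

Raising `confidence` to 0.99 widens the interval, which matches the glossary's point that the confidence coefficient sets the level of confidence.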

Confounded - A term describing a study or observation in which one or more confounding variables may lead to misinterpretation. Confounding variable - A variable that may affect the outcome variable and that differs among the study groups. Content validity - A measure of the extent to which the items in a test or measurement scale represent the dimensions being measured. Contingency table - A table used to display frequencies for two or more nominal or quantitative variables. Continuity correction - An adjustment to a statistical test when a continuous probability distribution is used to approximate a discrete probability distribution, for example when the chi-square distribution is used to analyze contingency tables of nominal data. Control subjects - In a clinical trial, subjects assigned to the placebo or control group; in a case-control study, subjects without the disease. Controlled trial - A trial in which subjects are assigned to a placebo or control group. Correlation coefficient (r) - A measure of the relationship between two variables, ranging from -1 to +1. A value of 0 indicates no relationship between the two variables, and -1 or +1 indicates a perfect relationship. When it measures a linear relationship, it is called the Pearson coefficient. Cost-benefit analysis - A quantitative method for evaluating the trade-off between the costs or disadvantages and the benefits of a procedure or management strategy. Cost-effectiveness analysis - A quantitative method for evaluating the cost of a procedure or management strategy that also takes the outcome into account, in order to choose the lowest-cost option. Covariate - A potentially confounding variable that is controlled for in analysis of covariance.
Cox proportional hazard model or Cox model - A regression method used when the outcome is censored. The coefficients of the Cox regression model are interpreted as relative risks (hazard ratios). Critical value - The threshold value such that the null hypothesis is rejected if the absolute value of the test statistic exceeds it. Crossover study - A clinical trial in which each group receives two or more treatments, but in a different order. Cross validation - A procedure for applying the results of an analysis from one sample of subjects to another sample of subjects to assess how reproducible the results are; commonly used in regression. Crude rate - A rate for the entire population that is not specific to, or adjusted for, any characteristic of the population. Cumulative frequency or percentage - In a frequency table, the frequency or percentage of observations at a given value plus that of all lower values. Decision analysis - A model for analyzing a decision-making process. Degrees of freedom (df) - A parameter of several common probability distributions, for example the t and chi-square distributions. Dependent groups or samples - Groups or samples in which the values in one group can be predicted from the values in the other. Dependent variable - The variable whose values are the results of the study. Also called the outcome variable or the end point of the study. Dependent-groups t test - See Paired t test. Distribution - The values that a particular variable takes and how often they occur. Distributions may be based on empirical observations or on theoretical probability distributions, for example the normal, binomial, and chi-square distributions. Double-blind trial - A clinical trial in which neither the subjects nor the researchers know which treatment the subjects received.
Dummy coding - A procedure in which the code 0 or 1 is assigned to a nominal predictor variable; used in regression models. Dunnett's procedure - A multiple-comparison method for comparing several treatment groups with a single control group, applied after significance has been found with the F test. Effect or effect size - The magnitude of a difference or relationship. It is specified in order to calculate the required sample size, and it is also useful for evaluating the clinical significance of study results and in meta-analysis. Error bar plot - A graph showing the mean plus a measure of spread, such as the standard error, for one or more groups. Error mean square (MSE) - The mean square that forms the denominator of the F ratio in ANOVA. Evidence-based medicine - The application of evidence based on clinical research, together with clinical expertise. Expected value - The value expected under the assumption that the intervention has no effect. Experimental study - A comparative study involving an intervention. It is called a clinical trial when humans are involved. F distribution - A probability distribution used to test the equality of two estimates of variance. It is the distribution used with the F test in ANOVA. F test - The statistical test that compares two variances. It is used in ANOVA. Factor analysis - A statistical method for examining the relationships among a set of items or measures in order to identify the underlying factors they have in common. Factorial design - An experimental design in ANOVA in which each subject or item receives one level of each factor. False-negative (FN) - A negative test result for a subject who has the disease. False-positive (FP) - A positive test result for a subject who does not have the disease. Fisher's z transformation - A transformation of the correlation coefficient that makes it approximately normally distributed. Frequency - The number of times a given value is observed.
Frequency distribution - For a series of numerical observations, the list of values that occur together with how often each occurs. It can be displayed as a frequency table or a frequency graph. Frequency table - A table showing the number or percentage of observations that occur at different values or ranges of values of a particular measure or variable. Gaussian distribution - See Normal distribution. Geometric mean (GM) - The nth root of the product of n observations, denoted GM or GMT. Used with logarithmic scales or skewed distributions. Gold standard - In diagnostic testing, a procedure that defines the true state of the patient, with or without the disease.
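
The geometric mean defined above is usually computed on the log scale, since the nth root of a product equals the exponential of the mean of the logarithms. A minimal sketch, with a hypothetical function name and made-up values:

```python
import math


def geometric_mean(values):
    """nth root of the product of n positive observations."""
    if any(v <= 0 for v in values):
        raise ValueError("the geometric mean requires positive values")
    # Averaging logarithms avoids overflow from multiplying many values.
    mean_log = sum(math.log(v) for v in values) / len(values)
    return math.exp(mean_log)


# Hypothetical titers on a tenfold dilution scale, for illustration only.
gm = geometric_mean([1, 10, 100])
```

For these three values the arithmetic mean is 37, while the geometric mean sits at the middle of the logarithmic scale, which is why it is preferred for skewed or log-scaled data.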

Hazard function - The probability that a person dies within a time interval, given that the person was alive at the beginning of the interval. Hazard ratio - Similar to the risk ratio; the ratio of the risk of an outcome, such as death, occurring at a given time in one group compared with another. Hierarchical design or nested design - A research design in which one or more treatments are nested within levels of another factor, such as patients within hospitals. Histogram - A graph of a frequency distribution. Historical cohort - A cohort study that uses existing records or historical data to determine the effect of a risk factor or exposure on an outcome variable. Historical controls - In clinical trials, previously collected observations on patients that are used as a control group for comparison with the treatment under study. Hypothesis test - A statistical procedure by which the null hypothesis H0 is either rejected or not rejected. Incidence - A rate giving the proportion of people who develop a given disease or condition during a specified period of time. Independent events - Events for which the occurrence of one has no effect on the probability of the other. Independent groups or samples - Samples in which the values in one group cannot be predicted from the values in the other group. Independent observations - Observations for which the value of one has no effect on the value of another. Independent variable - The predictor variable in a study. In ANOVA it is sometimes called a factor. Independent-groups t test - See Two-sample t test. Intention-to-treat - The principle of analyzing all subjects according to the group to which they were assigned at the beginning of the study. Interaction - A relationship between two independent variables such that together they have a different effect on the dependent variable than either has alone. Interquartile range - The difference between the 25th percentile and the 75th percentile.
Inter-rater reliability - The reliability of measurements made by two or more raters of the same item. Intervention - The maneuver applied in a study or experiment; it may be a drug or a treatment procedure. Intra-rater reliability - The reliability of measurements made by the same rater at two different points in time for the same item. Kaplan-Meier product limit method - A method of survival analysis for censored observations. Kappa - A statistic used to measure intra-rater or inter-rater agreement for nominal variables; customarily denoted κ. Length of time to event - The length of time from the start of treatment or follow-up until the outcome occurs. Level of significance - The maximum probability of falsely rejecting the null hypothesis that we are willing to accept in a hypothesis test. Life table analysis - A method for analyzing survival times for censored observations grouped into intervals. Likelihood ratio - In diagnostic testing, the ratio of the true-positive rate to the false-positive rate. Linear combination - A weighted sum or average of a set of variables or measurements. For example, the prediction equation in multiple regression is a linear combination of the predictor variables. Linear regression - The process of determining a regression or prediction equation to predict Y from X. Linear relationship - A relationship in which X and Y vary together at a constant rate. Logarithm (ln) - The natural logarithm of a number is the power to which e ≈ 2.718 must be raised to give that number. Logistic regression - A regression technique used when the outcome variable is binary. Log-linear analysis - A statistical method for examining the relationships among three or more nominal variables. It may be used as a regression method to predict a nominal outcome from nominal independent variables. Logrank test - A statistical method for comparing two survival curves when there are censored observations.
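
To make the kappa entry above concrete, the sketch below computes Cohen's kappa, one common form of the statistic for two raters; the glossary itself does not say which variant is meant, so this choice is an assumption. The function name and the agreement table are hypothetical.

```python
def cohen_kappa(table):
    """Cohen's kappa from a square agreement table.

    table[i][j] counts the items that rater 1 placed in category i
    and rater 2 placed in category j.
    """
    k = len(table)
    total = sum(sum(row) for row in table)
    # Observed agreement: proportion of items on the diagonal.
    observed = sum(table[i][i] for i in range(k)) / total
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(k)) for j in range(k)]
    # Chance agreement: expected diagonal proportion if raters were independent.
    expected = sum(row_totals[i] * col_totals[i] for i in range(k)) / total ** 2
    return (observed - expected) / (1 - expected)


# Hypothetical ratings of 50 items by two raters, for illustration only.
kappa = cohen_kappa([[20, 5], [10, 15]])
```

Kappa corrects the raw proportion of agreement for the agreement expected by chance, so it is lower than the observed agreement whenever some agreement would occur by chance.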
Longitudinal study - A study that follows subjects over a long period of time. Mann-Whitney-Wilcoxon test - See Wilcoxon rank sum test. Mantel-Haenszel chi-square test - A statistical test of two or more 2 × 2 tables. Used to compare survival distributions or to control for confounding factors. Marginal frequencies - The row totals and column totals of the frequencies in a contingency table. Marginal probability - The row totals and column totals of the probabilities in a contingency table. Matched-groups t test - See Paired t test. Matching - A process of making two groups similar on a variable, for example age or sex. McNemar's test - A chi-square test for comparing proportions in two dependent or paired groups. Mean (X̄) - The most common measure of the center of a distribution, denoted μ in the population and X̄ in the sample. In a sample, the mean is the sum of the X values divided by n, ΣX / n. Mean square among groups (MSA) - An estimate of the variance among groups in analysis of variance. Used as the numerator of the F statistic. Mean square within groups (MSW) - An estimate of the variance within groups in analysis of variance. Used as the denominator of the F statistic. Measurement error - The amount by which a measurement is incorrect for reasons arising in the measurement process; also called bias. Measures of central tendency - Summary numbers that describe the middle of a distribution. See Mean, Median, Mode. Measures of dispersion - Summary numbers that describe the spread of observations around the mean. See Standard deviation. Median (M or Md) - A measure of central tendency. It is the middle value of the observations when they are sorted from smallest to largest, that is, the value that divides the distribution into halves.
Also equal to the 50th percentile. Medical decision making or analysis - The application of probabilities to the decision-making process in medicine. It is the basis for cost-benefit analysis. MEDLINE - A system for bibliographic searching of journal articles, including those in Index Medicus. Articles that meet given criteria or contain given keywords can be retrieved for review by the researcher and for reviewing the scientific literature in the medical field.

Sample - A part of a population. Sampled population - The population from which the sample was selected. Sampling distribution of a statistic - The frequency distribution of a particular statistic over many samples. Used for statistical inference from a single study.

Sampling frame - A list of all subjects or individuals in the population from which a random sample is selected. Several types of sampling, for example systematic sampling, require one.

Scale of measurement - The level of precision with which a variable is measured.

Scatter-plot - A two-dimensional graph showing the relationship between two continuous variables. Scheffé's procedure - A method for multiple comparisons among means after the F test in analysis of variance. It is the most conservative approach to multiple comparisons. See the unit on multiple comparisons for details. Sensitivity analysis - In decision analysis, a method for determining how the optimal decision changes as a function of the probabilities. Sensitivity - The proportion of subjects with a positive diagnostic test among all subjects who have the disease. A test with high sensitivity has a low false-negative (FN) rate. Sign test - A non-parametric test of a hypothesis about the median of a single group. Simple random sample - A random sample of n subjects in which every individual in the population is equally likely to be selected. Skewed distribution - A distribution in which some observations lie far from the center in one direction only. If the outlying observations are low, the distribution is skewed to the left (negatively skewed). If they are high, the distribution is skewed to the right (positively skewed). Slope of the regression line - The amount by which Y changes for each unit change in X. The slope is denoted b in the regression equation. Spearman's rank correlation (rho) - A non-parametric correlation that measures the relationship between two measurements on the basis of their ranks. Specific rate - A rate that refers to a particular group or subset of the observations, for example an age-specific or cause-specific mortality rate. Specificity - The proportion of subjects with a negative diagnostic test among all subjects who do not have the disease. A test with high specificity has a low false-positive (FP) rate. Standard deviation (SD) - The most commonly used measure of the spread of data around the mean, denoted σ in the population and SD or s in the sample. The mean and standard deviation together can be used to describe the distribution of observations.
The standard deviation is the square root of the mean of the squared deviations of the observations from their mean. Standard error (SE) - The standard deviation of the sampling distribution of a particular statistic. Standard error of the estimate (Sy.x) - A measure of variation around the regression line. It is based on the differences between the observed and predicted values of the dependent variable Y. Standard error of the mean (SEM) - The standard deviation of the mean over a large number of samples. Standard normal distribution - A normal distribution with mean 0 and standard deviation 1; also called the z distribution. Statistic - A summary number for a sample, such as the mean, median, or standard deviation, usually used as an estimate of a population parameter. Statistical significance - Usually interpreted as the probability that a result occurred by chance, for example once in 20, corresponding to a P value less than or equal to 0.05. A result is statistically significant when the null hypothesis is rejected; the significance level is the probability of rejecting the null hypothesis by mistake, in other words of declaring that a difference exists when there is none. Statistical test - A procedure performed to test the null hypothesis, for example the t test or the chi-square test. Sums of squares (SS) - Quantities calculated in analysis of variance that are used to obtain the mean squares for the F test. Survey - An observational study with a cross-sectional design. Generally used to gather opinions and as a preliminary study whose results guide the design of more advanced research. Survival analysis - A statistical method for analyzing survival data when there are censored observations. Symbols - Greek letters represent population parameters and Latin letters represent sample statistics. Symmetric distribution - A distribution that has the same shape on both sides of the mean.
The mean, median, and mode are close together and located at the center of the distribution. This is the opposite of a skewed distribution. t distribution - A symmetric distribution with mean 0 and, for small sample sizes, a standard deviation greater than that of the standard normal distribution. As n increases, the t distribution approaches the normal distribution, and the two are close for samples of size 30 and above. t test - The statistical test used to compare a mean with a norm or standard, or to compare two means from small samples (n ≤ 30). It is also used to test whether a correlation coefficient or regression coefficient is zero. When the sample size is greater than 30, the t test is approximately equivalent to the z test. Target population - The population to which the researcher hopes to generalize the conclusions drawn from the sample. Test statistic - The specific statistic used to test the null hypothesis, for example the t statistic or the chi-square statistic, corresponding to the t test or the chi-square test. Third quartile - The 75th percentile. Transformation - A change in the scale of the values of a variable. The logarithmic transformation is a common type of transformation in medical research data. Treatment threshold - In diagnostic testing, the point at which the optimal decision is to treat the subject without first performing a diagnostic test. True-negative (TN) - A negative test result for a subject who does not have the disease. True-positive (TP) - A positive test result for a subject who has the disease. Tukey's HSD (honestly significant difference) - A post hoc test for making multiple pairwise comparisons among means after the F test in analysis of variance. It is one of the most widely accepted and recommended methods among statisticians. Two-sample t test - A statistical test of the null hypothesis that two independent or unrelated groups have the same mean.
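
The true-positive, false-positive, true-negative, and false-negative counts defined in this glossary combine into sensitivity, specificity, and the likelihood ratio. The following is a small sketch of that arithmetic; the function name and the counts are hypothetical.

```python
def diagnostic_summary(tp, fp, tn, fn):
    """Return sensitivity, specificity, and the positive likelihood ratio."""
    sensitivity = tp / (tp + fn)  # positives among subjects with the disease
    specificity = tn / (tn + fp)  # negatives among subjects without the disease
    false_positive_rate = 1 - specificity
    likelihood_ratio = sensitivity / false_positive_rate
    return sensitivity, specificity, likelihood_ratio


# Hypothetical test results for 100 diseased and 100 healthy subjects.
sens, spec, lr_positive = diagnostic_summary(tp=90, fp=20, tn=80, fn=10)
```

A test with no false positives would make the false-positive rate zero and the ratio undefined; the sketch leaves that case unhandled for brevity.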
Two-tailed test - A test in which the alternative hypothesis specifies a deviation from the null hypothesis in either direction. The critical region lies in both tails of the distribution of the test statistic. Also called a non-directional or two-sided test. Two-way analysis of variance - ANOVA with two independent variables. Type I error - The error made when a true null hypothesis is rejected by mistake, that is, when a difference is declared even though there is no actual difference. Type II error - The error made when a false null hypothesis is not rejected, that is, when no difference is detected even though there is an actual difference. Validity - The accuracy of a measurement, indicating how well it captures the characteristic it is intended to measure.

Variable - A characteristic of interest in a study that takes different values in different subjects or items. Variance - The square of the standard deviation, denoted σ² in the population and s² in the sample. The standard deviation is the square root of the variance. Variation within subject - The variability of repeated measurements on the same individual. It may occur naturally but may also indicate measurement error. Vital statistics - Mortality and morbidity rates used in epidemiology and public health. Weighted average - The average obtained by multiplying each value in a series by a number called its weight, summing the products, and dividing by the sum of the weights. Wilcoxon rank sum test - A non-parametric test for comparing two independent samples with ordinal data or non-normal numerical observations. Wilcoxon signed ranks test - A non-parametric test for comparing two dependent samples with ordinal data or non-normal numerical observations. Z approximation - The z test used to test the equality of two independent proportions. Z distribution - A normal distribution with mean 0 and standard deviation 1. Also called the standard normal distribution. Z ratio - The test statistic used in the z test. It is obtained by subtracting the hypothesized mean from the observed mean and dividing by the standard error of the mean. Z score - The deviation of X from the mean, divided by the standard deviation. Z test - A statistical test for comparing means in large samples (n ≥ 30). Z transformation - The conversion of a normally distributed variable with a given mean and standard deviation to the z distribution with mean 0 and standard deviation 1.
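
The weighted average, z score, and z ratio entries above are simple formulas, sketched here for illustration. The function names and every number in the examples are hypothetical, and the z ratio assumes the standard error of the mean is the standard deviation divided by the square root of n.

```python
import math


def weighted_average(values, weights):
    """Sum of value * weight, divided by the sum of the weights."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def z_score(x, mean, sd):
    """Deviation of x from the mean, in units of the standard deviation."""
    return (x - mean) / sd


def z_ratio(observed_mean, hypothesized_mean, sd, n):
    """Test statistic of the z test for a single mean."""
    standard_error = sd / math.sqrt(n)
    return (observed_mean - hypothesized_mean) / standard_error


# Hypothetical inputs, for illustration only.
avg = weighted_average([80, 90], [1, 3])
z = z_score(130, mean=100, sd=15)
ratio = z_ratio(observed_mean=105, hypothesized_mean=100, sd=15, n=36)
```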