Population Health


The above pyramids are for Prevalence Rates.  On the left in red is a very rare ICD with short life span and high fatality rates.  Which of the disease types represented by the pyramids to the right, above, and below does it best resemble?

— Work in Progress —


Chances are this page is all you will need to review to understand how to interpret the health of a population from the information you can gather about your research population, and how that population compares with national population health statistics profiles.

Population health pyramids are the same as population pyramids, except that the focus is on health: what the population pyramid by itself implies about health, and what the age-gender distribution of our cases tells us about the health of the particular population we are monitoring or researching.

Population health pyramids have rarely been used because little effort has been made, and little published, on the roles this sort of data might play in developing health prevention and managed care programs.  This use of the population pyramid has been ignored mostly because so little emphasis has been placed on it as a health monitoring technique.  The technique was very popular more than a century ago, and when the first census-related disease maps were generated as part of the standard decennial activity, these pyramids were incorporated into the reporting process.  They were most informative about diseases that are no longer a major public health concern today, such as infectious diseases and the early deaths they caused, and adult onset diseases like tuberculosis, deaths from which appeared to peak in people in their 40s.

Whatever the reason, only demographers and public health population specialists pay much attention to this important health monitoring method.  With managed care, demographic health is the focus, meaning that we are focusing on the sum of all cases for a particular illness, measuring that population’s health from one year to the next, without concern for add-ins and drop-outs.  The assumption being made with this routine is that the region managed by the program being evaluated is homogeneous in nature, although it isn’t homogeneous when interpreted at much smaller levels.  We assume that our region has a homogeneous population and that whoever leaves is statistically replaced in the long run from one period of review to the next.  Of course variation occurs from one year to the next, but due to the relatively low rate of turnover for each metric being considered, it is hoped that this small change in numbers doesn’t impact the overall performance of our program.

To make this point clearer in a statistical sense, a 5% turnover poses a significant risk only if the people who make up that 5% deviate significantly from the original population features, such as replacing 5 men with 5 women, 5 employed patients with 5 unemployed, or 5 high income patients with 5 low income.  In general, replacements are for the most part a random experience in population settings, so replacing 5 people who leave a program with 5 completely opposite individuals seems unlikely.  Take a study of 100 people in which 5 leave and are replaced.  If 1 in 5 of the people in any study is very unhealthy, then about one of the 5 replacements could be the opposite of the person replaced.  That one individual in turn changes just 1 of the 100 people in the study, and so in theory generates only a 1% change, an outcome not at all close to the 5% limits imposed for error analyses.
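The turnover arithmetic above can be checked with a few lines of Python.  This is only a sketch: the function name and the three inputs (study size, turnover rate, share of replacements that are "opposite") are illustrative assumptions, not values taken from any of the programs described here.

```python
def shift_from_turnover(n_members, turnover_rate, share_opposite):
    """Fraction of the study population changed by atypical replacements.

    n_members      -- people in the study (assumed value)
    turnover_rate  -- fraction who leave and are replaced per review period
    share_opposite -- fraction of replacements who differ sharply from
                      the person they replace
    """
    leavers = n_members * turnover_rate
    atypical_replacements = leavers * share_opposite
    return atypical_replacements / n_members


# 100 people, 5% turnover, 1 in 5 replacements is "opposite"
shift = shift_from_turnover(n_members=100, turnover_rate=0.05, share_opposite=0.20)
print(f"Share of the study population changed: {shift:.1%}")
```

Raising either the turnover rate or the share of atypical replacements shows how quickly the homogeneity assumption starts to strain.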

Population health pyramids are useful for measuring events, treatments, diseases, disease complications, emergency visits, specific behavioral diagnosis or V-codes data, with the goal of better understanding your population.  Some of the best insights we get using this method are answers to the following types of questions:

  1. How old are most mothers when they are abused by their spouse, and how does this relate to number of children, location of residence, and income?
  2. At what two ages are women most likely to be smoking, down to the one year age bracket?
  3. At what age, at the one year level, do most men in your population die due to liver failure?
  4. At what age does a child peak in his/her psychiatric or learning behavior diagnoses?
  5. At what age is a woman most likely to be diagnosed anorexic?
  6. At what age does a sexually active girl or woman begin displaying urinary tract diseases other than STDs, and which of these diseases may serve as indicators of high risk?
  7. When is a male patient most likely to be diagnosed pyromaniac?
  8. When does a genetic blood condition like Thalassemia set in and does it impact women more than men?
  9. When does sickle cell set in at its earliest, and is this early age more prevalent in women or in men?
  10. Based on survival features (not frequency of sexual activity), which gender has the likelihood of surviving with the gene the longest and to what peak age?


The following describes the age-gender distributions in one year increments for managed care patients with specific diagnoses.  The purpose here is to define the analytic process used to produce these population health pyramids and to describe some of the logic attached to how these graphs may be analyzed and used to implement new prevention programs that are more targeted and more capable of inducing change.


Part 1.  The Interpretation methods (Instructions and Key)

Over the years I developed a number of different ways to illustrate outcomes and test for significance in some statistical sense.  Generally we tend to use odds ratios to compare two outcomes when there is uncertainty as to how to compare and contrast two sets of results.  With odds ratios, we are still left without a measure of the significance of these differences, and so I developed a way to measure the statistical significance of the difference between two sets of results.  I then measure that independently across all subgroups regarding age and gender, and graph these results with the population pyramid format so comparisons can be made.
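For readers who have not worked with odds ratios, here is a minimal sketch of the calculation for one age-gender cell.  The confidence interval uses the standard textbook log (Woolf) method, which is a stand-in and not the significance measure I developed; the function name and the counts in the example are hypothetical.

```python
import math


def odds_ratio_with_ci(a, b, c, d, z=1.96):
    """Odds ratio and approximate 95% confidence interval for a 2x2 table.

    a -- cases in the test population
    b -- non-cases in the test population
    c -- cases in the reference (e.g. national) population
    d -- non-cases in the reference population
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive for this method")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR), Woolf's method
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper


# Hypothetical counts for a single age-gender bracket
or_value, ci_low, ci_high = odds_ratio_with_ci(30, 970, 20, 980)
print(f"OR = {or_value:.2f}, 95% CI {ci_low:.2f} to {ci_high:.2f}")
```

An interval that includes 1 signals that the odds ratio alone does not establish a difference, which is the gap the significance measure is meant to fill.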

The numbers generated and mapped are as follows:

  • n, by gender, male, female, in 1-year age ranges, 0-99 (99+)
  • n/N, for prevalence, with N = total population, male, female, in 1-year age ranges
  • Stat Sig, for prevalence and/or N, in 3-5 year age increments, reviewing rolling counts/prevalences compared between the test population and national.  (Inter-gender comparison can be done as well, and was in the beginning, but this use is much more valuable.  This is also used for numbers-range-adjusted values for any other age-gender 1-year increment evaluation, including costs, days late in refills, etc.)
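The three quantities in this list can be sketched as follows.  The counts are made up, the helper names are hypothetical, and the pooled two-proportion z statistic is an assumed stand-in for the Stat Sig step, since the actual formula is not given on this page.

```python
import math


def prevalence(cases, members):
    """n/N for every (sex, age) bracket that has members."""
    return {key: cases.get(key, 0) / n for key, n in members.items() if n > 0}


def window_total(table, sex, center_age, half_width=1):
    """Rolling total over ages center_age - half_width .. center_age + half_width."""
    ages = range(center_age - half_width, center_age + half_width + 1)
    return sum(table.get((sex, age), 0) for age in ages)


def two_proportion_z(x1, n1, x2, n2):
    """Pooled z statistic for the difference between two proportions."""
    if n1 == 0 or n2 == 0:
        return 0.0
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return 0.0 if se == 0 else (x1 / n1 - x2 / n2) / se


# Hypothetical 1-year counts for women aged 40-42
test_cases = {("F", 40): 12, ("F", 41): 15, ("F", 42): 9}
test_members = {("F", 40): 400, ("F", 41): 410, ("F", 42): 390}
national_cases = {("F", 40): 200, ("F", 41): 210, ("F", 42): 190}
national_members = {("F", 40): 20000, ("F", 41): 20500, ("F", 42): 19500}

prev_test = prevalence(test_cases, test_members)

# 3-year rolling window centred on age 41, test population versus national
z = two_proportion_z(
    window_total(test_cases, "F", 41), window_total(test_members, "F", 41),
    window_total(national_cases, "F", 41), window_total(national_members, "F", 41),
)
print(prev_test[("F", 41)], round(z, 2))
```

Widening `half_width` to 2 gives the 5-year window; the wider the window, the smoother the result and the less it reacts to a single odd year.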

Study Background

When I began this study I was focused on the groups poorly represented by the HEDIS and NCQA work I was performing.  My impression was that this type of program was at about a 50% productivity in terms of potential.  Adding more studies however at this time would have been detrimental.  First, the performance rates had to improve, and then necessary additions or improvements made, and finally new metrics developed.

I would later learn the program I was in was one of the best for the region, with exceptionally high scores and intervention success rates compared with all others except one. In particular it helps to know that some of the worst performing businesses were only 50% successful in terms of eliciting essential changes (not recommended, essential).  Once the above goals were met a new step was added three years into the project–the determination of who and what diseases, behaviors, etc., were poorly represented.

So I produced a plot of age ranges reviewed by all the metrics I was using and came out with the following age-number of studies graph.


For the lower chart, y-axis is number of studies, in increments of 5.  Red is HEDIS, blue is Managed Care PIPs and NCQA QIAs.  Link to still more details about this.

As noted above, childhood was significantly missed.  I used this to define 300+ metrics for studying the entire age range, adding more childhood studies related to mental health, sexual activity and diseases, intervention-PCP visit types, and smoking and drug related metrics.  For adults, I added relatively underrepresented issues such as epilepsy and MS, certain visit types, V-codes and Emergency Department utilization data (which I often call E-codes).
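The age-versus-number-of-studies count behind that graph is simple to reproduce.  In this sketch the metric names and age ranges are placeholders, not the real HEDIS, PIP or QIA definitions, and the function name is hypothetical.

```python
def studies_per_age(metrics, max_age=99):
    """Count how many metrics cover each 1-year age from 0 to max_age.

    metrics -- iterable of (name, min_age, max_age_covered) tuples
    """
    counts = {age: 0 for age in range(max_age + 1)}
    for _name, low, high in metrics:
        for age in range(max(low, 0), min(high, max_age) + 1):
            counts[age] += 1
    return counts


# Placeholder metrics; the age ranges are assumptions for illustration
metrics = [
    ("metric_A", 0, 2),
    ("metric_B", 18, 75),
    ("metric_C", 40, 69),
]

counts = studies_per_age(metrics)
uncovered = [age for age, n in counts.items() if n == 0]
print("Ages with no studies:", uncovered[0], "to", uncovered[-1], "(with gaps)")
```

Plotting `counts` as a bar chart by age gives the kind of graph described above, and `uncovered` lists the ages where new metrics are needed.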




Basic Pyramid interpretations

The following types of areal or regional interpretations of the spatial distribution of diseases were developed:

  • Region/Subregion (multiple state) analyses derived from HEDIS and Census regional definitions
  • Standard State based interpretations
  • Standard County based interpretations
  • Zip Code tract interpretations
  • Grid cell interpretations

For each of the above types, some smaller studies were also carried out to assess the validity and reliability of this method.  These focused on parts of the country where regional climatic-topographic features stand out in relation to the disease:

  • Southeast and New England  fungal disease foci
  • Chicago area leptosporidiosis
  • Animal vector-host borne diseases with defined foci (Lyme disease, West Nile, all tick-related fevers and neurological conditions or diseases)
  • Animal-related ICDs with well documented ecological expectations (snake bites, scorpion bites, spider bites, etc)
  • Elevation related emergency visits
  • Latitude-Climate linked temperature related E-codes and related ICDs (Sleep Cycle disturbances, allergenic diseases, etc.)
  • Airborne land use/industrial induced conditions (coal miner’s lung, etc.)
  • Commerce/Shipping related infestations (in-migrated diseases from foreign countries, approximately 175 of these)
  • Culturally-linked or culturally-bound diseases and potential in-migration routes (Africa, Japan, Orient, SE Orient, Russia, Australia, Middle America/Mexico, South America, Caribbean)

Select or Special Area evaluations

Each of these used either the full US or the continental US proper as its main projection area.

Regional Interpretations


The upper left regional map depicts federally defined census data region definitions.  Most if not all federal data is also maintained regionally using this layout.  Each large region has a storage place, usually an FTP site, where all national data is kept, stored and shared for research purposes.  Some of these sites are accessible, some are not.  With cloud practices now beginning to take hold of this part of the IT world, the use of FTP sites has declined.  These FTP sites still remain one of the richest sources of information, so long as you know how to interpret the raw data made available to you for national population demographics, environmental, health and spatial statistics.

The upper right map depicts how the nation was broken down into regions by the National Committee for Quality Assurance at the time this spatial analysis technique was in use.  It may no longer be in use for Federally Managed Care programs due to ongoing changes in the Federal Health Care system.

The lower map was used to break the U.S. down into more detailed units for population analysis.  Some eastern regions were subdivided because subtesting found that several multistate regions were too large to be useful research areas, due to distinct demographic, cultural and census-derived distances between smaller parts of these large regions.  The Atlantic and Gulf Coastline states were impacted the most by this change in regional definitions.  The change had the greatest impact on certain family-related health metrics.  For example, by dividing the middle and southeastern portion of the US and the Texas to Tennessee portion into smaller research areas, diseases or ICDs impacting people of retirement age and those related to childbearing-family health activities could be evaluated more precisely, in such a way as to determine specific needs for preventive health care practices in certain parts of this country.

Because questions remained when I developed this formula of queries, the final maps divided certain major sections of the country, as defined by my map, into two sets of subdivisions, usually defined by north-south and east-west dividing lines.  All variations could then be tested and a more accurate interpretation of the overall region provided.

Populations and “Regions”


The above map of regions provides population distributions for each region.  The age range for these pyramids is 0-64.  The entire US population reviewed is to the left.

Notice how the Upper Midwest area has four pyramids.  The pyramids at 12 o’clock and 6 o’clock depict north versus south differences, and the other two east-west differences.  The mid-Atlantic and southeast Florida sections, originally considered a single region for HEDIS reviews, were split to demonstrate population differences between the Capital District and the southeast section.  Other subregions tested were the New England area broken down into north and south halves, the Lower Midwest into Appalachia, Alabama and Texas regions, and the Rocky Mountains into North, Central and South.  The only regions not subdivided in the above evaluation are the Great Lakes (North central) region and the Pacific, which included the very low count populations for Alaska and Hawaii.




Part 2.  Interpretation Instructions

Peaks and Valleys


The above is an example of an older population pyramid modeling program I developed and ran.  Females are on the left, males are on the right.  This gender placement is opposite of the more recent formulas and modeling algorithms I developed which also appear on this page.

What are demonstrated by this pyramid are the peaks and valleys in age ranges for people actively engaged in  health care activities.  This is a count of the people involved at each age-gender bracket, not a count of their claims or amount of activity.

Notice the 0-1 year of age peak, followed by a decline to the lowest participation in life seen for children about 8 years of age.  This is followed by a peak in care for around 17-18 years.  For young to middle aged adult women, a small peak in health care is demonstrated for 30 years of age.  Due to the nature of this data gathering, this might include spousal care and care related to more preventive health measures engaged in by women versus men, and more tendency and willingness to visit a PCP when concerns about personal health arise.  There are also several small peaks for women appearing in their 30s and early 50s, followed by the major peak at 62 years of age.  Some of these relate to age-specific preventive activities like breast cancer and cervical screening activities, others are considered a normal consequence of aging and the need for more care as one gets older, for most of the cases.   For men, these early midlife peaks are lacking, and there is a tendency for men not to engage in health care activities until they are much older, close to retirement.  Their peak pretty much matches that of women, being at 62 or 63 years of age.



The above is a systems evaluation developed by age-gender analyses of claims for all people.  This data is derived from a true population re-assessed for a theoretical total N of approximately one billion.  These are the numbers of claims filed by these patients relative to their age.  This flaring out is standard for claims analyses.  The reasons are common sense: as one gets older, one visits doctors more often.  The two minor peaks in claims at 4-5 years of age may be a cultural standard related to pre-school and first and second year schooling experiences (need for an exam to go to school, along with exposure to other kids).  The 15-18 year old Male/Female peaks pertain to end of childhood related issues, end of schooling, initiation of sexual encounters, initiation of risk behaviors, etc.  Also notice that Females are on the left, and that they experience 1.5 to 2.0 times more claims related events than men of the same age.

Again, there is a lag in the increase of claims in men versus women, for the ages between 28 and 62.  Men hold off on visits, but in their later pre-retirement years are finally about equal in amounts of activities.
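The two pyramids in this section differ in what they count: the first counts people per age-gender bracket, the second counts claims.  A minimal sketch of that distinction follows, assuming each claim record carries a member ID, gender and age; the field layout and function name are assumptions, not the layout of any real claims file.

```python
from collections import defaultdict


def people_and_claims(claim_records):
    """Return (distinct people, total claims) per (sex, age) bracket.

    claim_records -- iterable of (member_id, sex, age) tuples, one per claim
    """
    members = defaultdict(set)
    claim_counts = defaultdict(int)
    for member_id, sex, age in claim_records:
        members[(sex, age)].add(member_id)   # a person counts once per bracket
        claim_counts[(sex, age)] += 1        # every claim counts
    people = {bracket: len(ids) for bracket, ids in members.items()}
    return people, dict(claim_counts)


# Hypothetical claim records
records = [
    ("m1", "F", 30), ("m1", "F", 30), ("m2", "F", 30),
    ("m3", "M", 62), ("m3", "M", 62), ("m3", "M", 62),
]
people, n_claims = people_and_claims(records)
print(people[("F", 30)], n_claims[("F", 30)])
```

Dividing claims by people in each bracket gives claims per person, which is one way to see the lag between men and women described above.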


Interpreting Overall Membership Population by Regions


These pyramids were applied to regions of the United States defined using a specific set of rules based on experience (explained later).  The purpose here is to demonstrate how these pyramids can be applied to a large area within a large population health database and result in better use of your complete data.

The first thing to notice about a population pyramid is that it has a “wing” like pattern, with its peak at the retirement age.  There is a significant decrease in patients who are 65 years of age or older due to changes in their health care plan upon retirement.  This anomaly is due to eligibility features, and has nothing to do with mortality, morbidity or the like.  Usually a good indicator of old age population health is the Florida area on the above regional map.  Notice that the wing-like form is nearly absent in this demographic age-gender pyramid due to old age retirees living in this region.

Another important broad age range to pay attention to is for the female population (pink, right side) of childbearing years.  This can be as broad as 13 to 64 years of age in theory, but is most important to watch for the 13 to 54 years of age individual.  On a population pyramid, it “bulges out” (no pun intended) between the ages of 17 and 45 for areas with large fertility rates and family sizes.  On the map above we see this peaking for the Rocky Mountain area.  In later examples, where the regions are better subdivided, we see small sections of the country where child-bearing is more prominent than in neighboring regions.

Part 3.  General Patterns.

Chronic Diseases


This example details 3 different chronic disease types.  The chronic diseases we normally hear about involve people in midlife.  These diseases may start earlier in life but peak during the peak years of life.  The most common examples noted above are Diabetes, Gastroesophageal Reflux Disease, and Irritable Bowel Syndrome.  They tend to demonstrate two or three very characteristic peaks: the first peak is a diagnosis/rule out related event during the later teen years, with claims/activities reducing in the 20s, due to changes in health care access for the most part, but also accompanied by the “I feel good” or “I can take care of it on my own” attitude.

The second peak is therefore during those midlife years, usually after the age of forty, when chronic disease problems begin to set in.  For diseases with a strong psychosomatic association, such as GERD and IBS, notice that men peak much earlier than women, usually around the age of 30.  Women on the other hand for certain diagnoses don’t peak until they are in their 40s.

There is a subsequent decline in cases or prevalence noted between that peak for men and women and the retirement years. Following retirement, with re-enrollment in new health care plans, the third peak is generally seen so long as the illness, condition or disease has not become fatal by this stage in life.

Mid-Age Onset Disease Patterns



Mid-Age Onset Early Diminishment or Fatality


Prevalence.  Men on the left, women on the right.

Compare the gender asymmetries.  The top form represents the list to its immediate left.  The middle outcome is for Atopic Dermatitis and the third for Portal Hypertension.


Lifelong Conditions or conditions that are not age-dependent regarding onset


Prevalence.  Men on the left, women on the right.

The left graph depicts the most common scenario for a non-fatal disease of this class.  The teenage to twenty-year-old peak includes rule out claims, hence its early appearance; the midlife peak is due to onset of symptoms or generation of the first diagnosis/rediagnosis; and the third peak is related to aging and health, with increases in care received for other diagnoses as well as an increasing likelihood of making and/or recording this diagnosis.

The right graph would be for something like Thalassemia.  Some inherited diseases are potentially fatal in early childhood, and then recommence their attack on the body during its final years.


Progressive End-Stage or End of Life Diseases


Prevalence.  Male is on the left, female on the right.

These demonstrate a “mushrooming effect”.  As the individual ages, the more likely the disease is to be found.  Two things are important here: the presence of a small childhood peak, perhaps due to rule outs because of genetic history, and the age at which the number of cases or prevalence begins to increase with a greater slope.  Some end of life diseases initiate onsets when people are in their 50s, others for people in their 60s, still others for people in their 70s, etc.  The top three pyramids depict three examples of different age of onset curves, with one demonstrating high gender specificity.


Part 4.  Special Classes of Disease Patterns


Disease patterns can differ due to spatial features.  We expect for example heat exhaustion to be worse in a southern latitude, or for tuberculosis to be less at high elevations, or for tick-related diseases to be more prevalent in tick-ridden ecological zones.  For the most part this is the case.  Occasionally there are patterns that appear which require more targeted research to determine why they happen.

The above pyramids demonstrate examples of these unexpected outcomes.

Adjunct Hemolytic Anemia is a condition induced by cold climates.  For this review, an example of the coldest region, New England, is compared with one of the hottest regions, the lower Mid-west (Texas and vicinity).  New England demonstrates this to be a primarily age-related diagnosis with prevalence increasing as we get older.  The Lower Midwest states demonstrate bimodalism, with a large peak in childhood, and a second peak in the retirement age range, but also a consistent retention of diagnosis rates for most of the age ranges of the remaining population.  For the older population, females express this condition more than men; there is no gender difference seen elsewhere in these pyramids.

Heat Exposure is a diagnosis for events related to exposure to heat.  Upper Midwest and Tristate (NY-NJ-PA) areas are compared.  In both, diagnoses for men far outnumber those for women.  The male population demonstrates several very distinct narrow age peaks.  In the Midwest, more cases are presented and these peaks are greater in number.  These diagnoses could be work and/or activity related, such as due to outdoor activities in work or recreational form.  Personal behaviors and risky behaviors may also help explain the differences between men and women.  Also note that between regions, women do not differ from each other as much as men do.

Temporal Lobe Epilepsy (TLE).  This third case demonstrates a combination of cultural, physiological and biological behaviors and influences.  TLE is a form of epilepsy that has unique cultural attributes assigned to it.  The New York area is compared with the Lower Midwest, the former being multiethnic and the latter being more distinct regionally due to a combination of local Tex-Mex cultural expectations and features, locally defined environmentally related risks for seizure onset (heat-induced, and physiological ionic imbalance induced seizures), and general form and quality of life (stress) related features.

There is in addition a strong Mexican-Middle American cultural heritage that needs to be considered in this region.  Just across the border, the prevalence of seizures increases greatly, suggesting culture and social behaviors may play a role in prevalence, by affecting reporting rates and/or impacting quality of life and longevity of people with this condition.

The small peak in the older teenage-early 20s age range has some other important factors to consider.  This small peak is seen for numerous mental health diagnoses as well.  This  suggests there is a reason for these two diagnostic/treatment peaks, which may be due to socially specific gender related expectations, such as males being given a second chance before visits that end up with a diagnosis are considered an option by the parents.

Epilepsy in general has an age-related behavior pattern, with the greatest likelihood for diagnosis seen in the older teen to young adult age groups.  The diminishment in cases as age increases can be due to mortality rates increasing with age, but may also be related to outgrowing these events, or to the likelihood that many ICDs coded for these cases are rule outs, based on age related factors.



Part 5.  Specific Disease Patterns.

Genetic Disease


The complexity of inherited disease patterns reveals much about how we should develop our programs for managing these patients.  There is a general tendency for us to compartmentalize certain diagnoses in medicine, and genetic diseases form a class that is often compartmentalized to such an extent that the only public concerns communicated for these conditions come around once each year, whenever an “Easter’s Seal” like event is scheduled to be held.  Living in the Hudson Valley, I see signs of this expulsion of the genetically disabled from most of society each time I pass by a place that I know serves as a home for these individuals.  Many of the individuals so served locally come from original families located closer to the Big Apple.  Because they are a financial burden to the residents of this charming place to visit, they are sent elsewhere to spend much of their life visually locked away from much of society, that is until the weekends arrive in the summertime, when they can hop on a bus and spend a half day or so at the County Fair, attending it for free on those days when most of the locals aren’t there.

The two diseases noted above have very strong culturally focused series of criticisms or arguments written about them in general.  Many people in the medical world think but do not voice their personal question – ‘why do such diseases still exist when in theory they are preventable through genetic testing and such?’

In the figures above, the right graph is numbers of people and the left percentages or prevalence.  Note how Thalassemia is distributed equally between genders as people with the disease progress in age, and that such is not the case for sickle cell carriers and those with actual sickle cell disease.  Also notice the different asymmetries displayed for those with the disease versus those who are carriers, based on the left pyramids (again, male is on the left, female on the right).  Females tend to demonstrate a delayed diagnosis during their child-bearing years; it is slight, but it is there.  They also show slightly more prevalence that lasts longer into the fertile period, and at the age of 40 they are still surviving, increasing the likelihood for this disease to be passed down.

Sickle Cell carriers display an obvious asymmetry, with men far more likely to bear it in their early midlife period.  For women, for most of their years, it does not demonstrate much in regard to peaks and valleys in age-specific prevalence.  Although sickle cell is an autosomally inherited genetic trait, not borne by the sex chromosomes (XX-XY), gender differences are clearly displayed for this diagnosis.

The other fact to keep in mind is that sickle cell diagnoses can occur with people who display a partial expression of this trait.  That is not so easy to relate to the above display of age-specific counts.


Atrial Fibrillation versus Stroke


Numbers – Statistical Significance Peaks.  Men – left, women – right.

In a prior review it was demonstrated that two very different regions of the United States worthy of special attention for comparison purposes are New England, and Florida with its neighboring states.

New England displays a tendency for late age adult onset diseases to appear 20 years earlier within its population, at least according to records indicating this diagnosis to be included in the health review.  The fact that older age disease patterns are striking the New England group so early suggests either that they are diagnosed earlier because they receive more forms and varieties of health care than people in Florida, or that they are expressing these various disease patterns at a much earlier age due to life style and occupation related reasons.

The latter line of reasoning tends to be more likely.  Since Florida has a much larger old age population, it seems ludicrous to think that health care in this region would miss these diagnoses for whatever reasons.  In addition, the rest of the country lacks this mid-age to pre-retirement display of disease patterns, as seen in Florida and New England, which is more consistent with the likelihood that New England has a medical profile unique from all the rest of the country.

This tendency for early onset of diseases with high risk to long term health is supported by about 40 to 50 diseases based on the ICD classifications used for this analysis.  The above population health pyramids depict two of these diagnoses to demonstrate this unique finding–Atrial Fibrillation and Stroke.

With atrial fibrillation, one can usually survive a while without much consequence of the fibrillation, although the risk for complications related to other, deadly forms of arrhythmia or the development of clotting related problems can be a long term consequence.  With cerebrovascular accident or stroke, there is a greater likelihood of fatality once the event happens, but for those who survive, there are complications and repeated events that become a threat to life.

For both of the above US regions, comparisons are made to the overall national averages and related population pyramid.  Each region is evaluated for n or prevalence regionally versus nationally, and a method of assigned risk evaluates the variation of each age value (per year) and its deviation from the national numbers.  This algorithm tests the numbers in much the same way that facial recognition equipment is used to differentiate small details such as nose and eyebrow length, protruding chin, height of ear bottom relative to lip, etc.  Its sensitivity is then set to display values once a certain critical number is reached.  Bars pointing right indicate that the test population has statistically more prevalence than the national averages.  Bars pointing left indicate national averages are more than the region at a statistically significant level.
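The display rule just described (show a bar only once a critical value is reached, pointing right when the region is higher and left when the nation is higher) can be sketched as below.  This is not my actual algorithm: the per-age test is an assumed normal approximation to the binomial with the national prevalence treated as fixed, and the function name, the 1.96 cutoff and all counts are illustrative.

```python
import math


def signed_significance_bars(region_cases, region_members, national_prev, critical=1.96):
    """Per-age signed bars: positive = region higher (bar points right),
    negative = nation higher (bar points left), 0.0 = not displayed."""
    bars = {}
    for age, n in region_members.items():
        p = national_prev.get(age, 0.0)
        if n == 0 or p <= 0.0 or p >= 1.0:
            bars[age] = 0.0  # no basis for a comparison at this age
            continue
        expected = n * p
        z = (region_cases.get(age, 0) - expected) / math.sqrt(n * p * (1 - p))
        # Suppress anything below the chosen sensitivity
        bars[age] = z if abs(z) >= critical else 0.0
    return bars


# Hypothetical regional counts against a national prevalence of 2%
region_members = {64: 1000, 65: 1000, 66: 1000}
region_cases = {64: 21, 65: 35, 66: 8}
national_prev = {64: 0.02, 65: 0.02, 66: 0.02}

bars = signed_significance_bars(region_cases, region_members, national_prev)
for age in sorted(bars):
    print(age, round(bars[age], 2))
```

Raising `critical` makes the display stricter, so only the strongest regional departures from the national pyramid remain visible.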

For the Florida population, we see, as expected, older people demonstrating greater prevalence that is statistically significant from the age of 65 on up for both ICDs.  This is noted for both men (red) and women (blue), with men three times greater than women.

In the New England results, we see the same, but for a much younger age group.  For atrial fibrillation, men and women have shared peak ages of diagnosis.  Both display diagnoses beginning around 39 years of age, peaking briefly, then dissipating, followed by a statistically significant re-emergence in diagnosis, for men mostly in their mid 50s.  For stroke, men demonstrate an onset years earlier than women and far more often than women, whose larger numbers and frequency are statistically significant only for a very short age band.


A number of special topics were reviewed over the years as well using this methodology of disease research.

The following areas of concentration were pursued as well, which hopefully with time I can post the results and findings for.  Links will be added as these are posted. (This is going on ten years of work, in increments, done alongside numerous other projects, so please be patient.)

  • Suicide, by age, gender, form of suicide
  • Infibulation practices over the span of ten years, per year
  • Hypertension
  • Diabetes
  • Epilepsy


My statement for the day . . .

You cannot put a stockbroker out on prairie lands and expect him to herd cattle, much less buyers and consumers.  Likewise, you can’t put a highly trained and skilled business analyst into the medical field or medical QI field, and expect him or her to deliver a higher quality service, increase average lifetime expectations, or demonstrate valuable health-related improvements when compared with the population health analyst.  Cost and Care do not go hand in hand, as much as these people like to think they do.  It costs money to improve quality of life for older age people.  In the long run this saves a lot of money, savings which unfortunately none of the current leadership, management, CIOs, CFOs, or CEOs are ever going to see, or be willing to invest in and thereby take a risk.