Judgment Time: An Analysis of the Department of Child and Family Services Algorithms

By: Taylor Toepke

At 3:50 p.m. on the Wednesday after Thanksgiving 2016, Timothy Byrne, a call screener at Pittsburgh’s hotline for child abuse and neglect, keyed in “Low risk” and “No safety threat” when asked to rate the likely threat to a child’s immediate safety. The call had come in after a preschooler told her teacher that a friend of her mother’s had been in her home when “he hurt their head and was bleeding and shaking on the floor and the bathtub” (Hurley 2018). Before making his decision, Byrne searched the department’s computer database and found allegations of abuse dating back to 2008, including parental substance abuse, domestic violence, medical neglect, and the sexual abuse of one of the girl’s two older siblings by an uncle (Hurley 2018). Thousands of calls, thousands of reports, and still thousands of children die each year as a result of abuse and neglect.

Algorithms and big data are changing how problems and day-to-day tasks are handled, offering new insights into everything from credit reports to stock recommendations, and now they are being used to assess at-risk children. Letting an algorithm and big data help decide whether a child needs immediate attention, or needs to be removed from the home, is a radical change. The Department of Child and Family Services (DCFS), as well as individual social workers, face enormous pressure in making such judgment calls, and these decisions can be life-changing for the children and families involved. Every day, social workers wonder whether the children they allowed to remain at home the day before will be safe and unharmed. Given the predictive power of algorithms, the question arises: can, and should, DCFS leverage data science to determine which children are at risk and which are not?

In 2013, Gabriel Fernandez, an 8-year-old boy in Los Angeles, was tortured and killed by his mother and his mother’s boyfriend. Netflix later created a series based on the events that took place in the Fernandez home. Despite four social workers and numerous police officers knowing about his situation, and a teacher who had filed reports with DCFS, no one removed him from the home or ensured he saw a doctor (Young 2020). According to Julius Young of Fox News, “Numerous parties made attempts to highlight what was going on in the home; however, pleas by his teachers… fell on deaf ears.” Because social workers knew about the abuse before his death, the case drew significant attention, much of it focused on the underfunded DCFS system that had allowed yet another child to die.

You might be wondering how things went so wrong. How could a system be so broken and fail a child who was stuck in a home, unloved and unheard? According to Dan Hurley of the New York Times, “Nationally, 42 percent of the four million allegations received in 2015, involving 7.2 million children, were screened out, often based on sound legal reasoning but also because of judgment calls, opinions, biases and beliefs” (Hurley 2018). Human error is inevitable and shows up in many of these situations. Algorithms can limit that error, but do they take the human out of the decision?

This is where judgment comes into play. Bias, prior experiences, and preconceived views shape how people perceive situations. I’m not saying the four social workers involved in the Fernandez case acted correctly, but I’m not saying they were entirely wrong either. If human error is unavoidable, will relying on algorithms alone prevent it?

In 2016, Allegheny County, Pennsylvania, home to Pittsburgh, became one of the first jurisdictions to let a predictive-analytics algorithm offer a second opinion on every incoming call. The screening tool displayed a risk score on a vertical bar running from 1 (lowest risk) to 20 (highest risk). “The assessment was based on statistical analysis of four years of prior calls, using well over 100 criteria maintained in eight databases for jails, psychiatric services, public-welfare benefits, drug and alcohol treatment centers and more” (Hurley 2018). The tool allowed Byrne, the call screener mentioned earlier, to pull up information about the mother that would normally take hours to gather. That same night, the algorithm scored the 3-year-old as high risk, 19 out of 20, and officials were called in; yet Hurley’s article reported that the child was not, in fact, at risk. The algorithm was built on data recorded by social workers and DCFS officials who bring their own preconceptions and biases, and that can make the data itself inaccurate. Still, even though the algorithm did not make an accurate prediction in this case, perhaps erring on the side of caution, flagging possible abuse even when it is not happening, is preferable to overlooking abuse when it does happen.
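Hurley’s description suggests a familiar pattern: a statistical model trained on historical calls produces a probability, which is then banded into a 1-to-20 score for the screener. The county’s actual model, features, and banding rules are not spelled out in the article, so the sketch below is purely illustrative; the feature names, the logistic-regression choice, and the scoring function are all assumptions.

```python
# Illustrative sketch only. Allegheny County's real model, features, and
# scoring rules are not detailed in the article; everything here
# (feature names, logistic regression, the 1-20 banding) is assumed.
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical features drawn from multiple administrative databases
# (jail records, psychiatric services, public-welfare benefits, etc.).
FEATURES = ["prior_referrals", "jail_records", "psych_services",
            "welfare_benefits", "drug_alcohol_treatment"]

def train_screening_model(X_history, y_history):
    """Fit a simple classifier on several years of prior calls.
    y_history = 1 if the call later led to a confirmed safety event."""
    model = LogisticRegression(max_iter=1000)
    model.fit(X_history, y_history)
    return model

def risk_score(model, x_case):
    """Map the model's probability onto a 1 (lowest) to 20 (highest) scale."""
    p = model.predict_proba(np.asarray(x_case).reshape(1, -1))[0, 1]
    return int(np.clip(np.ceil(p * 20), 1, 20))
```

The point of the banding is only to give screeners a quick second opinion; the underlying probability, and therefore the score, is only as trustworthy as the historical data it was fit on.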

Evaluating an algorithm means looking at how it was designed, the data behind it, and the humans who created it. Eckerd Kids, a Clearwater, Florida-based nonprofit that runs child welfare programs in several states, was contracted in Illinois to develop a “web-based program to pinpoint abuse and neglect investigations with the highest probability of serious injury or death to children” (Capitol Fax 2017). The algorithm Eckerd created rated children’s risk of being killed or injured within the next two years, generating thousands of alerts and ultimately assigning 4,100 Illinois children a 90 percent or greater probability of death or injury (Capitol Fax 2017). Despite the number of cases given serious scores, children still fell through the cracks. One child who did not receive a high-risk score, and therefore was never flagged for social workers, was found dead in her home in Joliet. Children fall through the cracks with or without human involvement. Without “clean,” honest, and unbiased data, there is no way to accurately predict whether a child is at risk.
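The mechanism described here, a probability estimate compared against an alert cutoff, is simple to sketch, and sketching it also shows why a miss like the Joliet case is always possible: a child whose estimated probability lands below the cutoff is never surfaced at all. The threshold value and case fields below are my assumptions, not Eckerd’s actual implementation.

```python
# Hypothetical sketch of threshold-based alerting; Eckerd's real cutoff and
# case schema are not public, so these values are assumptions.
from dataclasses import dataclass

ALERT_THRESHOLD = 0.90  # assumed cutoff, mirroring the "90 percent or greater" figure

@dataclass
class Case:
    case_id: str
    predicted_risk: float  # model's estimated probability of serious harm

def cases_to_alert(cases):
    """Return only the cases whose predicted risk meets the alert threshold.

    Any case scored just below the cutoff is silently screened out,
    which is exactly how a genuinely at-risk child can be missed."""
    return [c for c in cases if c.predicted_risk >= ALERT_THRESHOLD]

# Example: a child the model underestimates never generates an alert.
caseload = [Case("A-101", 0.97), Case("A-102", 0.42), Case("A-103", 0.89)]
flagged = cases_to_alert(caseload)  # only A-101 is surfaced
```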

According to one report, tools like Eckerd’s could “disproportionately select poor children of color for government intervention, critics warn, and automated decision-making may replace the judgment of experienced child welfare professionals” (State Capital 2017). To build a model that correctly identifies at-risk children, Eckerd needed historical data, so it analyzed thousands of closed abuse cases to find data points highly correlated with serious harm. “The parents’ ages could be a factor – or their previous criminal records, evidence of substance abuse in the home, or the presence of a new boyfriend or girlfriend” (State Capital 2017). Each variable was weighed carefully in building the algorithm; however, DCFS’s automated case-tracking system was riddled with data-entry errors and did not link investigations about many children to cases involving their siblings or other adults in the home. Each night, DCFS gave Eckerd a “data dump” from its case-tracking system, allowing Eckerd to generate near-real-time scores for each child (State Capital 2017). But if the system and the data an algorithm relies on are broken or “damaged,” how accurate can that algorithm be? Algorithms are supposed to be unbiased and to avoid the faulty decision-making humans exhibit, yet as Hurley puts it, “But what if the data they work with are fundamentally biased? There is widespread agreement that much of the underlying data reflects ingrained biases against African Americans and others” (Hurley 2018). Humans are ultimately behind every police report, every DCFS report, and the algorithm itself.
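The linkage problem described above is concrete: if a sibling’s prior investigation sits under a different, unlinked family record, any feature that counts “prior investigations in this household” will be understated, and the risk score drops with it. The toy records below are invented purely to illustrate that failure mode; they do not reflect any real DCFS data or schema.

```python
# Toy illustration of the record-linkage failure described above.
# All records and IDs are invented; no real DCFS data or schema is implied.
prior_investigations = [
    {"family_id": "F-001", "child": "older sibling", "allegation": "neglect"},
    {"family_id": "F-001", "child": "older sibling", "allegation": "abuse"},
    # Same household, but entered under a different, unlinked family ID:
    {"family_id": "F-873", "child": "younger sibling", "allegation": "abuse"},
]

def prior_incident_count(family_id, records):
    """A feature a risk model might use: prior investigations for this family."""
    return sum(1 for r in records if r["family_id"] == family_id)

# With broken linkage the model sees 1 prior incident instead of 3,
# so the household looks lower-risk than it really is.
print(prior_incident_count("F-873", prior_investigations))  # -> 1
```

Garbage in, garbage out: no amount of modeling sophistication recovers history the database never connected in the first place.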

The data behind each case is “hand-written,” shaped by human bias and judgment about each individual situation. Social workers and detectives go through hundreds or thousands of cases a year, seeing different children, parents, and patterns of abuse in those households, and they carry those past experiences into the next case they take on. A seriously abusive situation may seem less urgent simply because it is being compared with an even worse case encountered before. No case is truly independent of the others; it becomes one more file stacked in the DCFS cabinet, where it may go untouched or unnoticed by an algorithm that is not accurate 100 percent of the time.

Blame is a recurring controversy when the system fails to identify at-risk children. The four social workers involved in the Gabriel Fernandez case were just a few of the people blamed for the inaction that ultimately led to Gabriel’s death. But the blame cannot be placed solely on the social workers or the detectives; the system itself is also at fault. The data people rely on is part of that system, encoded and written by trained social workers, complete with human bias and judgment. By the same logic, to blame the algorithm for poor predictions, you must blame the data; and to blame the data, you must blame the system.

There is no such thing as a perfect algorithm that delivers 99.99 percent accuracy. The best way to ensure that each case and each child has a fair chance of receiving the right intervention is for social workers and the algorithm to work hand in hand. These algorithms were created not to replace humans and their decision-making, but to support and guide their decisions and to reduce human error and bad judgment. The algorithm can provide a solid first pass at identifying medium-to-high-risk children, and social workers must then analyze each flagged case to determine whether further action is needed. Each case must be judged on its own facts, independently of past outcomes or decisions on similar cases, and the data must be consistent, monitored, and reviewed by more than one social worker.
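One way to picture that division of labor is a triage step in which the score only routes cases to people and never closes them on its own, and every case requires sign-off from more than one reviewer before it is screened out. The score bands, routing labels, and two-reviewer rule below are my assumptions for illustration, not an existing DCFS or county workflow.

```python
# Illustrative human-in-the-loop triage; the score bands, reviewer rule,
# and names are assumptions, not an actual DCFS or county process.
def triage(score_1_to_20):
    """Route a case by risk score; the algorithm never closes a case itself."""
    if score_1_to_20 >= 15:
        return "urgent_review"     # senior screener plus supervisor, same day
    if score_1_to_20 >= 8:
        return "standard_review"   # assigned social worker reviews the full file
    return "second_opinion"        # a low score still gets a human look before screen-out

def final_decision(route, reviewer_judgments):
    """Require at least two independent reviewers to agree before closing a case."""
    if len(reviewer_judgments) < 2:
        return "pending_additional_review"
    if all(j == "no_risk" for j in reviewer_judgments):
        return "screen_out"
    return "investigate"
```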

This is not an easy task, and many will probably say it is impossible given the volume of cases that comes in each week, each month, each year. But whether it is one child or a thousand, each one deserves a chance to be saved from abuse, harm, and neglect.

References

Capitol Fax. “DCFS Says It’s Scrapping Predictive Analytics Program.” State Capital Newsfeed: The Capitol Fax Blog (Illinois), December 6, 2017. advance-lexis-com.weblib.lib.umt.edu:2443/api/document?collection=news&id=urn:contentItem:5R44-J1Y1-DXKX-02HP-00000-00&context=1516831. Accessed April 7, 2020.

Hurley, Dan. The New York Times Magazine, via The Toronto Star, January 14, 2018. advance-lexis-com.weblib.lib.umt.edu:2443/api/document?collection=news&id=urn:contentItem:5RDD-44C1-DY91-K0PJ-00000-00&context=1516831. Accessed April 3, 2020.

State Capital. “Data Mining Program Designed to Predict Child Abuse Proves Unreliable, DCFS Says.” State Capital Newsfeed: Illinois Senate Republican Caucus, December 7, 2017. advance-lexis-com.weblib.lib.umt.edu:2443/api/document?collection=news&id=urn:contentItem:5R4F-VNW1-DXKX-03T1-00000-00&context=1516831. Accessed April 3, 2020.

Young, Julius. “Netflix’s ‘Trials of Gabriel Fernandez’ Is a ‘Case-Study’ into Nationwide Government Secrecy, Producers Say.” Fox News, FOX News Network, March 4, 2020, http://www.foxnews.com/entertainment/netflix-trials-gabriel-fernandez-government-secrecy-producers.