Evaluations of the SDM Model

Few case management systems used in child welfare have been subjected to as much empirical scrutiny as the SDM model. This section summarizes some of the most salient evaluation research conducted in relation to SDM during the 12 years in which the model has been used in child welfare. The SDM system has been examined on two primary levels:
The next section presents the results of studies that have examined the reliability and validity of SDM risk assessment tools. This section is followed by a summary of impact evaluations conducted to date in Michigan.

Evaluation of Research-Based Risk Assessment

CPS agencies have traditionally relied on clinical judgment to establish the risk levels of families served by the system. However, recent research (Rossi, Schuerman, and Budde, 1996) has demonstrated that clinical decisions regarding the safety of children vary significantly from worker to worker, even among those considered to be child welfare experts. Moreover, although consensus-based risk assessment tools were developed to enhance consistency among caseworkers and improve decision making, recent studies indicate that the reliability and validity of these instruments are well below accepted standards (Baird et al., 1999; Baird and Wagner, 2000; Falco, in press).

Comparative reliability and validity of different risk assessment models. A study that CRC recently completed for the U.S. Department of Health and Human Services’ Office of Child Abuse and Neglect (OCAN) compared the reliability and validity of three risk assessment tools. Two were consensus-based tools (the Washington State model and the Fresno, CA, risk assessment [3], which is a derivative of the Illinois CANTS system). The third was a research-based instrument (the Michigan version of SDM risk assessment). To measure the reliability of the models, 80 randomly selected cases were assessed by 4 case readers using the Washington model, 4 using the Fresno model, and 4 others using the Michigan tool.
Two measures of reliability were examined: the percentage of cases in which raters reached the same conclusion about risk level and Cohen’s Kappa, a statistical measure of reliability [4]. The results for the research-based Michigan risk tool showed that at least three of the four raters agreed on the risk level for 85 percent of the cases (Baird et al., 1999) (see figure 11). However, this level of agreement was obtained for only 45 percent of the cases with the California scale and 51 percent of the cases with the Washington risk tool [5]. According to Cohen’s Kappa, which was computed for each set of raters, the SDM system (Michigan) was again deemed far more reliable than the two consensus-based systems (see figure 12).
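The two reliability measures described above can be illustrated with a minimal Python sketch. It is not the study's analysis code: the function names and the sample ratings are hypothetical. Cohen’s Kappa is defined for a pair of raters, and the text does not say how the study combined results across its four raters, so the sketch computes Kappa for one pair only.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's Kappa for two raters who labeled the same cases."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("raters must label the same non-empty set of cases")
    n = len(rater_a)
    # Observed agreement: share of cases where both raters chose the same level.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: computed from each rater's own distribution of levels.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[level] * freq_b[level] for level in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical level throughout
    return (observed - expected) / (1 - expected)


def share_with_agreement(ratings_by_case, min_agree):
    """Fraction of cases where at least min_agree raters chose the same level."""
    hits = sum(max(Counter(r).values()) >= min_agree for r in ratings_by_case)
    return hits / len(ratings_by_case)


# Hypothetical ratings: one tuple per case, one entry per rater.
ratings = [
    ("low", "low", "low", "low"),
    ("high", "high", "high", "moderate"),
    ("low", "moderate", "moderate", "high"),
    ("moderate", "moderate", "moderate", "moderate"),
    ("high", "low", "high", "low"),
]

print(share_with_agreement(ratings, 3))  # at least three of four raters agree
print(share_with_agreement(ratings, 4))  # all four raters agree
first = [r[0] for r in ratings]
second = [r[1] for r in ratings]
print(cohens_kappa(first, second))
```

Kappa discounts the agreement that two raters would reach by chance, so it can be much lower than the raw percentage of matching ratings when most cases fall into one or two risk levels.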
Because risk assessment tools attempt to classify families according to the likelihood of future maltreatment, the validity of these tools can be demonstrated by showing a significant increase in subsequent maltreatment for every increase in risk level. According to this criterion, the OCAN study found superior validity in the research-based tool. In a sample of more than 1,400 cases from California, Florida, Michigan, and Missouri, the SDM research-based risk assessment tool categorized families into groups with significantly different risk levels. The families classified as higher risk had many more subsequent substantiations of maltreatment than the families classified as lower risk. The consensus-based tools, on the other hand, sorted families into risk levels that had little correlation with actual outcomes (Baird et al., 1999). These findings are shown in table 1. For each risk model, the table shows the percentage of cases at each risk level that had subsequent investigations and substantiations during an 18-month followup. The data show that for both outcome measures there was little difference between families classified as moderate risk and families classified as high risk by the California model. Further, there was little difference in subsequent substantiations for the families classified at each risk level on the Washington model. In contrast, there were significant differences in outcomes by risk classification when the Michigan model was used to assess the families.
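The validity criterion described above, a higher rate of subsequent maltreatment at each higher risk level, can be sketched in Python as follows. The risk level labels, function names, and case records are hypothetical and are not taken from table 1. The sketch checks only that the rates rise in order; it does not test whether the differences are statistically significant, as the study did.

```python
from collections import defaultdict

# Hypothetical ordering of risk levels, lowest to highest.
RISK_ORDER = ["low", "moderate", "high"]


def outcome_rates(cases, levels=RISK_ORDER):
    """Rate of a followup outcome (e.g., substantiation) at each risk level.

    cases is an iterable of (risk_level, had_outcome) pairs.
    """
    totals = defaultdict(int)
    events = defaultdict(int)
    for level, had_outcome in cases:
        totals[level] += 1
        events[level] += int(had_outcome)
    return {lvl: events[lvl] / totals[lvl] for lvl in levels if totals[lvl]}


def rates_rise_with_risk(rates, levels=RISK_ORDER):
    """True if the outcome rate strictly increases with each risk level."""
    ordered = [rates[lvl] for lvl in levels if lvl in rates]
    return all(lower < higher for lower, higher in zip(ordered, ordered[1:]))


# Hypothetical followup records: 10 cases per level.
cases = (
    [("low", True)] * 1 + [("low", False)] * 9
    + [("moderate", True)] * 2 + [("moderate", False)] * 8
    + [("high", True)] * 4 + [("high", False)] * 6
)

rates = outcome_rates(cases)
print(rates)
print(rates_rise_with_risk(rates))
```

A tool whose moderate-risk and high-risk groups show nearly the same rate would fail this check, which is the pattern the text reports for the consensus-based models.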
* The California risk assessment involved in this study was a consensus-based tool that predated California’s implementation of SDM and research-based risk assessment.
Source: Baird and Wagner, 2000.

Research-based risk assessment and equity. Disproportionate numbers of minority children, particularly African Americans, are placed in foster care, and minority children spend more time in placement than their Caucasian counterparts (Hill, 2001). This disproportionality was the case long before SDM was introduced and remains a prevalent pattern, raising the issue of equity in CPS decision making. Because empirically based risk assessment tools use information related to poverty and other social conditions, some practitioners have questioned whether the instruments contribute to racial bias. Under the SDM system, however, foster care placement is guided by safety assessment, not risk. Risk level guides case opening and intensity of services decisions. Disproportionate representation of minority youth in foster care, therefore, should not be attributed to the use of research-based risk assessment tools.

Nevertheless, it is important to measure how SDM risk assessments perform across racial and ethnic groups. Because equity is a key principle of SDM development efforts, every SDM risk tool validated to date has been subjected to an examination of its validity within racial and ethnic populations. These tests have shown that the use of SDM instruments results in virtually equal assignment of all races and ethnicities to each risk level. Table 2 presents data from Michigan as an example of the level of equity SDM has attained. These data on more than 6,500 white and 5,000 African American families show that there is no disproportionate representation at any risk level (Baird, Ereth, and Wagner, 1999).
The California Department of Social Services also conducted an independent, detailed analysis of the individual items incorporated in that State’s new research-based tool and an overall assessment of the tool’s equity. The analysis found no bias in any item or in the instrument as a whole (Johnson, 1999).
Equally important findings relevant to the issue of equity have come from Baird and his colleagues. They have found that the rate of subsequent maltreatment observed within racial and ethnic groups increases with each incremental rise in risk level and that maltreatment rates within each risk category are similar among all groups (Baird, Ereth, and Wagner, 1999).

Evaluation of the Michigan SDM System

Between 1989 and 1992, CRC and Michigan child welfare staff worked together to design an SDM system for CPS cases (Baird et al., 1995). When initially implemented, the system consisted of risk and needs assessment instruments, case planning and reassessment tools, and differentiated service standards. System implementation began in 13 pilot counties in 1992. Michigan’s phased implementation schedule for the system presented an opportunity to formally evaluate the impact of SDM by comparing outcomes in the 13 pilot counties with those in a matched sample of 11 counties still operating under the traditional system. The evaluation sample included all cases with substantiated reports of abuse or neglect between September 1992 and October 1993. The SDM and comparison study samples each consisted of approximately 900 families. Outcome measures included new referrals, investigations, and substantiations during a 12-month followup period. The evaluation revealed several important differences in decision making and case processing in the SDM and comparison counties. These findings are summarized in the sections that follow.

Case closing decisions. The SDM counties were significantly more likely than non-SDM counties to close low- and moderate-risk cases following substantiation, and the non-SDM counties were more likely than SDM counties to close high- and intensive-risk cases. Moreover, cases that were closed without services in the SDM counties had significantly lower re-referral rates than those closed without services in the comparison group.
This finding indicates that the use of risk assessment led to improved decisions in the SDM counties regarding which cases could be safely closed at the completion of the investigation.

Program participation. Service program participation by families was significantly higher in the SDM counties than in the comparison counties, particularly among high- and intensive-risk families. For example, high-risk families in SDM counties were more likely than those in non-SDM counties to receive parenting skills training, substance abuse treatment, family counseling, and mental health services (see figure 13). This outcome is likely a result of the clear identification (via the risk assessment) of these families as being more likely to reabuse or reneglect their children and the more consistent identification of existing problems (via the SDM strengths and needs assessment).
Outcomes. The evaluation also examined whether implementation of the SDM system resulted in a better overall system of child protection and, in particular, lower rates of subsequent maltreatment. Families in the SDM and comparison counties were followed for 12 months to determine whether use of the SDM system resulted in lower rates of re-referral or resubstantiation. Figure 14 compares results for CPS cases in SDM counties with those in non-SDM counties. For every outcome measure, families in the SDM counties had better results than families in the comparison counties. The greatest difference was in rates of subsequent maltreatment substantiations: that rate was roughly 50 percent lower in SDM counties than in non-SDM counties (6.2 percent versus 13.2 percent).
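The size of the difference in substantiation rates can be checked with a short calculation. The sketch below uses the two rates reported above; the function name is illustrative only.

```python
def relative_reduction(comparison_rate, program_rate):
    """Proportional drop in a rate relative to the comparison group."""
    if comparison_rate <= 0:
        raise ValueError("comparison_rate must be positive")
    return (comparison_rate - program_rate) / comparison_rate


# Subsequent substantiation rates reported in the text, in percent.
non_sdm_rate = 13.2
sdm_rate = 6.2

reduction = relative_reduction(non_sdm_rate, sdm_rate)
print(f"{reduction:.1%}")  # prints 53.0%
```

The absolute gap is 7 percentage points, which corresponds to a relative reduction of about 53 percent.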
An analysis of outcomes by risk group also showed positive results for the Michigan SDM system. For example, high-risk CPS cases handled in the SDM counties had fewer subsequent referrals for maltreatment and fewer subsequent child injuries than high-risk cases in the non-SDM counties. They also had lower rates of subsequent placement in foster care and were only half as likely to have a subsequent maltreatment substantiation.

Summary. The results of this carefully controlled evaluation show not only that SDM resulted in important changes in decision making and service provision for child welfare cases but, as anticipated, that it ultimately had a positive impact on the protection of Michigan’s children. Although the rigor of the Michigan study has not been duplicated in other agencies, data from Wisconsin counties seem to support the Michigan findings. In Wisconsin, high- and very high-risk cases that were opened for services had much lower rates of subsequent reports of maltreatment than cases at similar levels of risk that did not receive child protection services [6]. As table 3 illustrates, high levels of intervention in these cases lowered the rate of subsequent reports dramatically. At the same time, services had negligible effects on low- and moderate-risk families (Wagner and Bell, 1998). Thus, data from both Michigan and Wisconsin indicate that accurate identification of families with the greatest potential for subsequent maltreatment, together with appropriate allocation of resources, can play a significant role in protecting children from harm.
[3] The California risk assessment tool in this study was a consensus-based model that was used prior to the implementation of SDM.

[4] Cohen’s Kappa is essentially a measure of the extent to which raters agree in their assessment of cases beyond that which would occur by chance alone. As Baird and Wagner (2000:738–739) point out, “There is no definitive Kappa threshold that designates an acceptable level of reliability, but Kappas below .3 generally indicate a very weak level of reliability. Although researchers vary on what is considered adequate, a Kappa above .5 to .6 is generally deemed acceptable.” For a full description of Cohen’s Kappa, see Rossi, Schuerman, and Budde, 1996:16–17.

[5] When the criterion was 100-percent agreement among the raters (i.e., all four agreed on the risk level), the Michigan instrument also significantly outperformed the two consensus-based instruments. The raters using the Michigan scale had perfect agreement for 58 percent of the cases; however, the raters using the California scale all agreed on only 29 percent of the cases and those using the Washington scale all agreed on 39 percent of the cases.

[6] In these Wisconsin counties, all investigated cases are assessed on risk. However, a large proportion of cases are not opened for services because the current allegation has not been substantiated. Hence, this analysis compares the outcomes of substantiated cases at each risk level with those of unsubstantiated cases at each risk level. The comparison is appropriate because the risk assessment instrument has been validated on both the substantiated and unsubstantiated populations.