Title: Batterer Intervention Programs: Where Do We Go From Here?
Series: Special Report
Author: Shelly Jackson, Lynette Feder, David R. Forde, Robert C. Davis,
Christopher D. Maxwell, and Bruce G. Taylor
Published: National Institute of Justice, June 2003
Subject: Domestic violence, batterers, program evaluation
41 pages
113,000 bytes

Figures, charts, forms, and tables are not included in this ASCII plain-text file.
To view this document in its entirety, download the Adobe Acrobat graphic file
available from this Web site or order a print copy from NCJRS at
800-851-3420 (877-712-9279 for TTY users).

----------------------------

Batterer Intervention Programs: Where Do We Go From Here?

U.S. Department of Justice
Office of Justice Programs
810 Seventh Street N.W.
Washington, DC 20531

John Ashcroft
Attorney General

Deborah J. Daniels
Assistant Attorney General

Sarah V. Hart
Director, National Institute of Justice

----------------------------

Batterer Intervention Programs: Where Do We Go From Here?

Shelly Jackson, Lynette Feder, David R. Forde, Robert C. Davis, Christopher
D. Maxwell, and Bruce G. Taylor

NCJ 195079

----------------------------

Findings and conclusions of the research reported here are those of the authors
and do not reflect the official position or policies of the U.S. Department of
Justice.

The studies discussed in this report were supported by the National Institute of
Justice under grants 94-IJ-CX-0047 and 96-WT-NX-0008.

The National Institute of Justice is a component of the Office of Justice
Programs, which also includes the Bureau of Justice Assistance, the Bureau of
Justice Statistics, the Office of Juvenile Justice and Delinquency Prevention, and
the Office for Victims of Crime.

----------------------------

About This Report

Batterer intervention programs were introduced as a way to hold batterers
accountable without incarcerating them. Initial studies suggested that the
programs reduced battering. Two evaluations of programs in Broward County,
Florida, and Brooklyn, New York, based on more rigorous experimental
designs, claim that the programs have little or no effect.

There are two possible explanations for these findings. One is that the
evaluations may be methodologically flawed; the other is that something may be
wrong with the programs themselves. This report analyzes both possibilities and
suggests directions for future policy and research.

What did the researchers find? 

In the Broward County study, no significant differences were found between
batterers in the treatment and control groups on reoffense rates or attitudes
toward domestic violence. In the Brooklyn study, the results were more
complicated: Men who completed an 8-week treatment program showed no
differences from the control group, but men who had completed a 26-week
program had significantly fewer official complaints lodged against them than the
control group. No difference was found among the three groups in attitudes
toward domestic violence.

What were the studies' limitations?

In both studies, response rates were low, many people dropped out of the
program, and many victims could not be found for subsequent interviews. The tests
used to measure batterers' attitudes toward domestic violence and their
likelihood to engage in future abuse were of questionable validity. In the
Brooklyn study, random assignment was overridden to a significant extent,
which makes it difficult to attribute effects exclusively to the program.

Who should read this report?

Administrators of batterer intervention programs, advocates, and researchers.

----------------------------

Contents

About This Report

Batterer Intervention Programs 
Shelly Jackson

The Broward Experiment
Lynette Feder and David R. Forde

The Brooklyn Experiment
Robert C. Davis, Christopher D. Maxwell, and Bruce G. Taylor

Analyzing the Studies
Shelly Jackson

----------------------------

Batterer Intervention Programs

Shelly Jackson

With the establishment of proarrest policies in the 1980s, increasing numbers of
batterers were seen in criminal courts across the country. Initially, they were
sentenced to jail. Some victims, however, began to say that although they
wanted the battering to stop, they did not want their partners incarcerated. To
respond to these requests while still holding batterers accountable, offenders
were referred to batterer intervention programs (BIPs, also known as spouse
abuse abatement programs or SAAPs).[1] This has led researchers and
advocates to ask, "Do these programs work?" 

Although early evaluations suggested that BIPs reduce battering, recent
evaluations based on more rigorous designs find little or no reduction. The
methodological limitations of virtually all these evaluations, however, make it
impossible to say how effective BIPs are.

This NIJ Special Report describes the results of two recent evaluations that add
to this growing literature. Lynette Feder and David Forde in Broward County,
Florida, and Robert Davis, Bruce Taylor, and Christopher Maxwell in
Brooklyn, New York, conducted experimental evaluations of programs based
on the Duluth model (see "Types of batterer intervention programs"). The
Brooklyn evaluation found some reductions in battering, but it found no
evidence that the program had any effect on batterers' attitudes. The Broward
County evaluation found no change in either behavior or attitudes. These
evaluations are described in detail below. 

Types of batterer intervention programs 

The first BIP models were psychoeducational programs. One such program,
the Duluth model, is based on the feminist theory that patriarchal ideology,
which encourages men to control their partners, causes domestic violence. The
Duluth model helps men confront their attitudes about control and teaches them
other strategies for dealing with their partners. This model is the most common
form of BIP in the Nation; many States mandate that BIPs conform to the
Duluth model.

There are several alternatives to the Duluth model.[2] Cognitive-behavioral
intervention views battering as a result of errors in thinking and focuses on skills
training and anger management.[3] Another model, group practice, works from
the premise that battering has multiple causes and therefore combines a
psychoeducational curriculum, cognitive-behavioral techniques, and an
assessment of individual needs.[4] Examples of these programs include Emerge
and AMEND (Abusive Men Exploring New Directions).[5] 

Programs based on batterer typologies or profiles--most commonly
psychological and criminal-justice-based typologies[6]--are gaining
popularity.[7] BIPs based on these profiles are just beginning to be developed
and have not been evaluated.[8] 

Another, more controversial, intervention is couples therapy. This model views
men and women as equal participants in creating disturbances in the
relationship. Although couples therapy may be appropriate for some people, it
is widely criticized for inappropriately assigning the woman a share of the blame
for the continuation of violence.[9] 

Review of batterer intervention program evaluations

More than 35 BIP evaluations have been published. Early studies, which used
quasi-experimental designs, consistently found small program effects; when
more methodologically rigorous evaluations were undertaken, the results were
inconsistent and disappointing.[10] Most of the later studies found that
treatment effects were limited to a small reduction in reoffending,[11] although
evidence indicates that for some participants (perhaps those already motivated
to change), BIPs may end the most violent and threatening behaviors.[12] The
results, however, remain inconclusive because of methodological flaws in these
evaluations.[13] 

Although most of the programs evaluated followed the Duluth model, cognitive-
behavior therapy has also been examined. In 21 of 24 controlled studies,
reoffense rates were lower among program participants than among the control
group (although not all differences were statistically significant).[14] These
effects were larger in demonstration programs (implemented by a researcher)
than in practical programs (implemented by a juvenile or criminal justice
agency) or a combination of the two. This suggests that the way a program is
put into practice (i.e., how faithful it is to the intervention model) may be key in
determining its impact. Outcomes were measured only for an average of 20
weeks after the end of treatment, which did not allow an assessment of longer
term reoffense rates. 

Differences in evaluation methods account for much of the inconsistency in
findings. Pure experimental designs, favored by researchers because of their
methodological rigor, make finding true effects easier and reduce the likelihood
of error but are challenging to carry out in the field; as a result, design flaws may
cast doubt on the results. Quasi-experimental designs, which differ from pure
experiments in that group assignment is not random, are easier to carry out but
are more open to misinterpretation. Thus, it is hard to tell whether program
effects are true or masked because the evaluation was compromised in the
field. 

The next two chapters present the results of recent BIP evaluations in Broward
County, Florida, and Brooklyn, New York. Both studies used classical
experimental designs: Batterers were randomly assigned to experimental or
control groups. In Broward County, men in the experimental group were
sentenced to 1 year of probation and 26 weeks of group counseling at a BIP,
whereas men in the control group were sentenced to 1 year of probation. In
Brooklyn, due to circumstances discussed in detail in the chapter devoted to
that study, some men in the experimental group received their treatment in 26
weekly sessions, while others attended longer, twice-weekly sessions for 8
weeks. Men assigned to the control group took part in a community service
program. In both studies, the two groups were tested to see whether treatment 
had changed their attitudes toward violence. Recidivism was measured both by 
official measures and by victim reports of abuse. In Broward County, offender
self-reports of abuse were also recorded.

The Broward County study found no significant difference between the
experimental and control groups in attitudes toward the role of women, whether
wife beating should be a crime, or whether the State had the right to intervene in
cases of domestic violence. It also found no significant difference between
groups in victims' perceptions of the likelihood that their partners would beat
them again. Official measures followed the same pattern: No significant
difference was found between groups in violations of probation or rearrests. In
fact, men assigned to the experimental group were more likely to be rearrested
than members of the control group unless they had attended all of the treatment
sessions.

In the Brooklyn study, initial findings showed that the experimental group as a
whole was less likely than the control group to be arrested again for a crime
against the same victim. On a closer look, however, only the 26-week group
had significantly fewer official complaints than the control group at 6 and 12
months. The pattern of victim reports was the same (although the differences
between the 8- and 26-week groups were not statistically significant). The
study found no difference among the three groups in attitudes toward domestic
violence.

These studies, however, suffer from several limitations. Response rates were
low and sample attrition high in both studies. The measures of batterers'
attitudes toward domestic violence and likelihood to engage in further abuse are
of questionable validity. Random assignment to the control group was
overridden to a significant extent, especially in the Brooklyn study. Without
process evaluations, there is no way to tell how well the Duluth model was
being implemented in the treatment sites. These and other limitations and the
policy implications of the Broward County and Brooklyn studies are discussed
in detail in the final chapter.

Notes

1. Although this report discusses male batterers, women batter as well. It is
highly probable, however, that the dynamics of battering differ for males and
females, which suggests the need for batterer intervention programs designed
specifically to meet the needs of female batterers. Currently, it appears that
most women batterers are being placed in male-dominated batterer intervention
programs.

2. Healey, K., C. Smith, and C. O'Sullivan, Batterer Intervention: Program
Approaches and Criminal Justice Strategies, Issues and Practices, Washington,
DC: U.S. Department of Justice, National Institute of Justice, February 1998.
NCJ 168638.

3. Babcock, J.C., and J.J. La Taillade, "Evaluating Interventions for Men Who
Batter," in Domestic Violence: Guidelines for Research-Informed Practice, ed.
J.P. Vincent and E.N. Jouriles. Philadelphia: Jessica Kingsley Publishers, 2000;
Lipsey, M., G. Chapman, and N. Landenberger, "Cognitive-Behavioral
Programs for Offenders: A Synthesis of the Research on their Effectiveness for
Reducing Recidivism," paper presented at the "Systematic Reviews of
Criminological Interventions" conference, Washington, DC, April 2-3, 2001.

4. Babcock, J.C., and J.J. La Taillade, "Evaluating Interventions for Men Who
Batter."

5. Healey, K., C. Smith, and C. O'Sullivan, Batterer Intervention: Program
Approaches and Criminal Justice Strategies.

6. Holtzworth-Munroe, A., and G.L. Stuart, "Typologies of Male Batterers:
Three Subtypes and the Differences Among Them," Psychological Bulletin
116(3)(1994): 476-97.

7. Healey, K., C. Smith, and C. O'Sullivan, Batterer Intervention: Program
Approaches and Criminal Justice Strategies.

8. Wexler, D.B., "The Broken Mirror: A Self Psychological Treatment
Perspective for Relationship Violence," Journal of Psychotherapy Practice and
Research 8(2)(1999): 129-41.

9. Babcock, J.C., and J.J. La Taillade, "Evaluating Interventions for Men Who
Batter."

10. See Babcock, J.C., C.E. Green, and C. Robie, "Does Batterer's Treatment
Work? A Meta-Analytic Review of Domestic Violence," Journal of Family
Psychology (under review); Davis, R.C., and B.G. Taylor, "Does Batterer
Treatment Reduce Violence? A Synthesis of the Literature," Women and
Criminal Justice 10(2)(1999): 69-93; Tolman, R.M., and J.L. Edleson,
"Intervention for Men who Batter: A Review of Research," in Understanding
Partner Violence: Prevalence, Causes, Consequences, and Solutions, ed. S.R.
Stith and M.A. Straus. Minneapolis, MN: National Council on Family
Relations, 1995: 262-73.

11. Babcock, J.C., and J.J. La Taillade, "Evaluating Interventions for Men
Who Batter."

12. Edleson, J.L., "Controversy and Change in Batterer's Programs," in Future
Interventions with Battered Women and their Families, ed. J.L. Edleson and
Z.C. Eisikovitz. Thousand Oaks, CA: Sage Publications, 1996: 154-169;
Gondolf, E.W., "Batterer Programs: What We Know and Need to Know,"
Journal of Interpersonal Violence 12(1)(1997): 83-98.

13. Healey, K., C. Smith, and C. O'Sullivan, Batterer Intervention: Program
Approaches and Criminal Justice Strategies.

14. Lipsey, M., G. Chapman, and N. Landenberger, "Cognitive-Behavioral
Programs for Offenders: A Synthesis of the Research on their Effectiveness for
Reducing Recidivism."

----------------------------

About the Author

Shelly Jackson is a program manager in NIJ's Office of Research and
Evaluation.

----------------------------

The Broward Experiment 

Lynette Feder and David R. Forde

Methodology 

Selection procedure 

This study took place in Broward County, Florida (encompassing Fort
Lauderdale), in the two courts charged exclusively with handling domestic
violence cases in that jurisdiction. It used a classical experimental design. All
men convicted of misdemeanor domestic violence in the county during a 5-
month period in 1997 were randomly assigned to experimental or control
groups.[1] The only exceptions were:

--Couples in which either defendant or victim did not speak English or Spanish.

--Couples in which either defendant or victim was under 18 years of age or the
defendant was severely mentally ill.

--Cases in which the judge allowed the defendant to move to another
jurisdiction at the time of sentencing and serve his probation through mail
contact. 

All other defendants (a total of 404) were assigned randomly to one of the two
groups. 

Random assignment 

Cases were randomly assigned based on the computer-generated court docket
number. The judge announced the assignment at the time the defendant was
adjudicated. Defendants were placed in the experimental group if their docket
number was even and in the control group if it was odd. This method allowed
the judges to carry out the process quickly, and it let researchers know when
assignments were not random.
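The parity rule described above can be sketched in a few lines. This is only an illustration of the stated rule (even docket number to the experimental group, odd to the control group) and of how a researcher might flag sentences that departed from it; the function names and the sample docket numbers are hypothetical and do not come from the study.

```python
def assign_group(docket_number: int) -> str:
    """Apply the stated rule: even docket number -> experimental, odd -> control."""
    return "experimental" if docket_number % 2 == 0 else "control"


def find_overrides(cases):
    """Return docket numbers whose recorded group differs from the parity rule."""
    return [docket for docket, recorded in cases if recorded != assign_group(docket)]


# Hypothetical docket numbers and recorded sentences, for illustration only.
cases = [
    (1001, "control"),
    (1002, "experimental"),
    (1003, "experimental"),  # odd docket placed in treatment: an override
]
overrides = find_overrides(cases)
print(overrides)
```

Because the rule is deterministic given the docket number, any mismatch between the docket parity and the sentence imposed is visible after the fact, which is what lets researchers identify nonrandom assignments.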

Men placed in the experimental group were sentenced to 1 year of probation
and 26 weeks of group counseling sessions from a local BIP. Men placed in the
control group were sentenced to 1 year of probation only. At sentencing, the
judge referred defendants to one of five county-certified batterer treatment
programs, each of which used the Duluth model. The county's probation office
monitored compliance.

Outcome measures 

To capture the true amount of change in individuals undergoing court-mandated
counseling, researchers included measures from several sources. Batterers were
interviewed at adjudication and again 6 months later. Victims were interviewed
at adjudication and 6 and 12 months later. Valid, reliable standardized
measures were used whenever possible. Probation records and computer
checks with the local police for all new arrests were used to track defendants
for 1 year after adjudication. 

Hypothesis 

Although the study's ultimate purpose was to test whether court-mandated
counseling reduced the likelihood of future violence by convicted batterers, it
also was designed to test the theory that stake-in-conformity variables (e.g.,
age, a steady job, marriage to one's partner, a stable residence) could explain
when an intervention (an arrest or court-mandated treatment) would reduce the
likelihood of subsequent violence. This study began with two hypotheses:

--Batterers who were mandated to undergo counseling would be less likely to
beat their partners again than those assigned to the control group.

--Men with a high stake in conformity would be less likely to beat their partners
again than those with a low stake in conformity.

Batterer profile 

Age and marital status. Batterers participating in this study ranged from 19 to
71 years of age; the typical offender was 35 years old. Fifty-seven percent
were white, 36 percent were black, and 6 percent were Hispanic. Forty-five
percent of the batterers said they were married, 43 percent said they were
single, and 13 percent said they were separated or divorced. 

Education and economic status. Most of the men were long-term Broward
County residents who had lived there for an average of 160 months. Twenty-
five percent reported that they failed to complete high school; 9 percent said
they had graduated from college. Most of the men rented (67 percent) rather
than owned (33 percent) their homes. Seventy-two percent reported being
employed at the time of adjudication, with most of these saying that they had
been at their current job for 2 years or less. Forty-seven percent of the men
reported working in an unskilled or semiskilled position; 8 percent reported
working as officials and managers. Salaries ranged from $250 to $10,000 per
month.

Criminal record. Many of the men had a prior criminal record. Forty percent
had one or more misdemeanor arrests (averaging about 0.9 misdemeanor
offenses per individual), and 20 percent had one or more felony arrests
(averaging 0.3 prior felony arrests per offender). Many had been convicted and
jailed (44 percent had one or more jail stays) or imprisoned (7 percent had
been imprisoned at least once). For 85 percent of the men in the sample, this
was their first arrest for domestic violence. 

Police reports noted that approximately 28 percent of the incidents of domestic
violence for which the defendants had been convicted or adjudicated involved
alcohol; another 3 percent involved drugs. Victim injuries were recorded in 74
percent of the cases. These injuries most often were bruises (58 percent),
although 8 percent were severe enough to require the victim's hospitalization.
Men were taken into custody 99 percent of the time. 

Victim profile 

Age and marital status. A profile of the women involved in this study is drawn
from responses to the victim survey at the time of adjudication. Victims ranged
from 18 to 63 years of age; the typical victim was 34 years old. Women were,
on average, 2 years younger than their partners; age differences ranged from 23
years younger to 14 years older. About 53 percent of the women reported that
their husbands had battered them; 37 percent said that their live-in boyfriends
had battered them. Victims reported that they had been with the batterer an
average of 7 years.

Education and economic status. About 23 percent of victims said they had less
than a 12th-grade education; about 10 percent were college graduates. Forty-
seven percent said they were employed full-time, 19 percent reported part-time
employment, 11 percent said they were homemakers, and approximately 3
percent said they were unemployed and looking for a job. Of those who were
working, 63 percent reported they were in unskilled or semiskilled positions,
and almost 20 percent reported working in professional or managerial
positions. Women with better jobs may have been overrepresented in the victim
survey sample; 90 percent of these women reported that their husband or
boyfriend was working, whereas only 72 percent of the men in the sample
reported that they were employed at the time of adjudication.

Treatment delivery measures

Batterers in the experimental group usually were assigned to attend 26 group
counseling sessions over 26 weeks. A batterer who missed a session was
required to make it up. Almost 29 percent attended all the sessions, and
approximately 95 percent missed five or fewer sessions. Eventually,
approximately 66 percent attended all of the sessions; about 13 percent
attended no classes at all. Of the control group, 97 percent attended no classes.

Outcome measures 

Offender and victim interviews used several standardized scales to assess the
outcomes of the experiment. These included an abbreviated version of the
Inventory of Beliefs About Wife Beating and Attitudes Towards Women.
Batterers were also asked whether they believed that their battering should be
considered criminal, whether they thought they were responsible, and how
likely they were to batter again. The revised Conflict Tactics Scale (CTS2) was
used to assess their self-reports of verbal, physical, or sexual abuse within the
previous 6 months. Since men assigned to a BIP may not have attended any or
all of the sessions, or some not assigned may have attended on their own, data
were analyzed in terms of both treatment assigned and treatment received.
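The distinction between analysis by treatment assigned and by treatment received can be illustrated with a minimal sketch. The records below are invented for illustration and are not study data; the field names, the use of rearrest as the outcome, and the "attended at least one session" cutoff for treatment received are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class Case:
    assigned_bip: bool      # sentenced to BIP counseling (hypothetical field)
    sessions_attended: int  # sessions actually attended (hypothetical field)
    rearrested: bool        # any rearrest during follow-up (hypothetical field)


def rearrest_rate(cases):
    """Share of cases with a rearrest; NaN if the group is empty."""
    cases = list(cases)
    if not cases:
        return float("nan")
    return sum(c.rearrested for c in cases) / len(cases)


def by_assignment(cases):
    """Compare groups as sentenced, regardless of attendance."""
    treated = rearrest_rate(c for c in cases if c.assigned_bip)
    control = rearrest_rate(c for c in cases if not c.assigned_bip)
    return treated, control


def by_receipt(cases):
    """Compare those who attended any session with those who attended none."""
    attended = rearrest_rate(c for c in cases if c.sessions_attended > 0)
    none = rearrest_rate(c for c in cases if c.sessions_attended == 0)
    return attended, none


# Invented records, for illustration only.
cases = [
    Case(True, 26, False),
    Case(True, 0, True),
    Case(True, 20, False),
    Case(False, 0, True),
    Case(False, 0, False),
    Case(False, 12, False),
]
print(by_assignment(cases))
print(by_receipt(cases))
```

Only the comparison by assignment keeps the protection of randomization; men who choose to attend may differ from those who do not, so a comparison by treatment received is best read as descriptive.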

Victims were asked about the batterer's behavior, their beliefs about who was
responsible, and whether they thought another battering was likely. Offenders
were asked about self-reported partner abuse at time of adjudication and 6
months later. Victims were surveyed at time of adjudication, 6 months later, and
1 year later. At each point, survey data were analyzed for differences between
the experimental and control groups to see whether changes occurred over
time. 

Experimental integrity 

Given the problems inherent in running an experiment, the integrity of the
experiment as carried out must be addressed.

Outcome of random assignment. Statistical tests showed that the original
random assignments did not differ from chance.[2] Forty-two of the 446 cases
(9 percent) were dropped for failing to meet the criteria for inclusion, however,
and in another 14 cases (3.5 percent), judges placed men originally assigned
to the control group into treatment. This left a total of 174 men (43 percent) in
the control group and 230 men (57 percent) in the treatment group. The
likelihood of such a large split between the groups is very low.[3]
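One way to gauge how unlikely a 43/57 split is under even-odds assignment is an exact two-sided binomial test. The sketch below is not necessarily the test the authors used; it assumes each of the 404 retained cases had a 50 percent chance of landing in either group and derives the control-group count from the reported 43 percent share.

```python
from math import comb


def two_sided_binomial_p(k: int, n: int, p: float = 0.5) -> float:
    """Exact two-sided p-value: total probability of outcomes no more likely than k."""
    pmf = [comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]
    observed = pmf[k]
    # Small tolerance so ties with the observed probability are counted.
    return sum(x for x in pmf if x <= observed * (1 + 1e-9))


total = 446 - 42                  # cases remaining after exclusions
control_n = round(0.43 * total)   # control count implied by the 43 percent share
p_value = two_sided_binomial_p(control_n, total)
print(total, control_n, p_value)
```

Under these assumptions the p-value falls well below the conventional 0.05 threshold, which is consistent with the statement that so large a split is unlikely to arise by chance.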

Equivalence tests at the time of adjudication found no significant differences
between the two groups in stake-in-conformity variables (criminal record, the
domestic violence incident for which they had been convicted or adjudicated,
or offender demographics), with one exception: The control group was 2 years
younger than the experimental group. Studies consistently have found that older
men are less likely to abuse their partners and to continue battering.[4]
Therefore, the observation that men in the control group were significantly
younger than those in the experimental group would, if anything, make it easier
to find a treatment effect.

Survey response rates. Individuals did not volunteer to be part of the
experiment, but they could not be interviewed without their consent. Although
all defendants who met the criteria were included in the sample, not all
defendants and their victims agreed to be interviewed. Many victims who did
not respond could not be located; on the other hand, many defendants simply
refused to be interviewed.

The low response rate reflects the charged environment in which the experiment
was conducted. Vocal opposition to the project led many who had supported
the research financially to take a step back. Although they did not actively
oppose the research, their failure to deliver their promised support (on which
the researchers relied) strained project resources and lowered response rates.
Response rates among defendants were 80 percent for the first interview and
50 percent for the second, 6 months after adjudication. Response rates for men
in the experimental and control groups were equivalent. Completion rates
among victims were even lower: 49 percent for the first interview, 30 percent
for the second, and 22 percent for the last. Victims of batterers in the
experimental and control groups had no significant difference in response rates.
Although low response rates are typical when working with victims of domestic
violence,[5] they present a limitation to this study. 

As one would expect, it was easier to track defendants' progress through
official measures. The research team collected and coded all probation folders
(and the information in them) at time of adjudication and coded all but one again
12 months later. As a further check, each defendant's name was run against the
computerized files from the county sheriff's office, which contained the records
for all arrests in Broward County. 

Integrity of experimental and control conditions. The literature gives examples
of "compensation," providing the control group with something extra to make
up for not receiving the intervention.[6] This threatens internal validity because
the control group is no longer a genuine control group (i.e., the two groups are
no longer comparable in all ways except that the experimental group receives
the intervention).

In this study, researchers tested for this possibility. Since judges had the
opportunity to order additional monitoring or supervision for the control group,
judicial orders for men in both groups were compared. Judges were found to
have assigned equivalent evaluations, supervision, and non-BIP treatment
programs to the men in both groups. Since the county probation office could
have more closely supervised the men in the control group, the two groups
were compared for the following:

--The number of months that they failed to report to the probation office
without being cited for violating probation conditions.

--The number of probation meetings scheduled, missed, and rescheduled.

--The number of months for which there were written monthly reports for each
probationer.

--Whether they underwent alcohol or drug testing.

--The number of times they were tested. 

None of these comparisons were significant or even showed a tendency toward
significance; thus, there is no reason to conclude that probation officers treated
the two groups differently.

The probation office also might not have sufficiently monitored the attendance
of the experimental group. If batterers were not sufficiently sanctioned for failing
to attend treatment, this experiment would not offer a true test of the efficacy of
court-mandated counseling. This possibility was investigated by looking at
treatment attendance history. Of the 79 men who missed BIP sessions without
making them up, 70 (89 percent) were cited for violating probation conditions
on one or more occasions. Of the nine (11 percent) who missed BIP sessions
and were not cited for violating probation conditions, four had missed only one
session and one had missed only two. These results indicate that the probation
office adequately monitored attendance and sanctioned batterers for not
attending treatment. 

Random assignment ensured that the experimental and control groups were
comparable before treatment. There is no reason to believe that the two groups
did not receive the same amount and kind of monitoring, supervision, and
treatment throughout the test period, with the single exception that the
experimental group was mandated to attend BIP counseling sessions and the
control group was not. 

Findings 

Offender attitudes 

Offender surveys compared men in the experimental and control groups at time
of adjudication, at least 6 months later, and for the change between the two
times. By the time of their second interview, 30 percent of the experimental
group had concluded their counseling program. More important, the sample as
a whole had completed an average of 22 of the 26 mandated counseling
sessions (approximately 85 percent of the intended "dosage" of counseling). 

Approximately half of the men viewed battering as acceptable in certain
situations. No differences were found between the experimental and control
groups in the first or second surveys or over time. There was no difference
between groups initially or over time in their views of the proper roles of
women, whether battering should be considered a crime, or whether the State
had a right to intervene. Both groups also reported the same likelihood of
beating their partners again. 

The only change noted in all of these comparisons was a small but significant
change in men's views of their partners' responsibility for the offense that led
them to court. Over time, those in the control group viewed their partners as
increasingly responsible. In contrast, in the 6 months after adjudication, those in
the experimental group saw the woman as slightly less responsible. Even so,
however, the men in the experimental group still viewed their partners as
"somewhat" to "equally" responsible for the incident. 

Several studies indicate that batterers hold more traditional views than
nonbatterers about women and their proper roles. BIPs are based on the
premise that teaching men that it is wrong to exert verbal, physical, or sexual
control over their partners will lead to changes in their beliefs that will ultimately
produce changes in their behavior. The results of these analyses seem to
indicate, however, that men directed by courts into BIPs did not change their
beliefs about the legitimacy of battering, their responsibility for these
incidents, or the proper roles for women any more than men in the control
group did.

Victim attitudes 

Victim interviews clearly indicated that the vast majority of women viewed
battering as inappropriate in virtually all contexts. Not surprisingly, this runs
counter to what most of the men reported. This held true for victims whose
partners were in either group and did not change over time. Victims reported a
more liberal view of women's roles than their partners did. The experimental
and control groups showed no differences in women's attitudes about the
appropriate role for women, nor did these views change significantly over time.

Victims in both the experimental and control groups shared the same
perceptions over time of whether the offense that brought them to court should
be viewed as a crime. About 57 percent of the women, compared with 26
percent of the men, believed the offense should be viewed as a crime.

Victims rated themselves as not at all to somewhat responsible for the battering,
whereas men rated the women as almost equally responsible. Again, there were
no significant differences between the experimental and control groups in
women's perceptions of responsibility.

Finally, victims in the experimental and control groups showed no significant
differences in their perceptions of the likelihood that their partner would hit them
again. This was the case in both the first and second surveys and over time.
Women saw such an event as more likely than the men did (20 percent versus
5 percent).

Offender self-reported likelihood to engage in abuse 

Thirty percent of the men reported taking what the CTS2 defines as a minor
abusive action (including grabbing and slapping) against their partners within 6
months after adjudication. Thirty-two percent of the women reported such an
incident within the same period. Eight percent of the men reported engaging in
more severe physical abuse (using a knife or gun, choking, or beating up their
partner), compared with 14 percent of the women who reported being victims
of such abuse.

As exhibit 1 indicates, no differences were found between groups initially or
over time in men's self-reported likelihood to engage in any of the activities
listed on the CTS2 (negotiation, psychological coercion, physical abuse, sexual
coercion, and injury). A regression analysis was performed to determine
whether assignment to treatment, treatment received (number of treatment
classes attended), or stake-in-conformity variables (e.g., marital status,
residential stability, and employment) could account for any differences in men's
self-reported use of severe physical violence. Consistent with the analysis of
attitudes and beliefs presented above, the results indicated that neither
assignment to a BIP nor attending the classes was significant in explaining
severe physical violence. Instead, stake-in-conformity variables were important
in accounting for this variation. Younger men with no stable residence were
significantly more likely to report committing acts of severe physical violence
against their partners than their older, more residentially stable counterparts.

Victim reports of their partners' likelihood to engage in abuse 

As exhibit 1 indicates, no difference was found between groups or over time in
women's reports of their partners' likelihood to engage in any of the activities
listed on the CTS2. Fourteen percent of the women reported that an act of
severe physical violence occurred during the followup period. Stake-in-
conformity variables best predicted repeated battering. Offenders' age and
marital status were found to be significant, while offenders' employment, though
not significant, demonstrated a strong tendency to relate to victims' reports of
severe physical violence. Women involved with, but not married to, younger
jobless men were more likely to report incidents of severe physical violence.

----------------------------

Exhibit 1. Revised Conflict Tactics Scale: Average score on scale by survey

----------------------------

Official measures--violations of probation 

Comparisons between the experimental and control groups would be unfair if
one group could be cited for violations of probation (VOP) for reasons that did
not apply to the other group. Men in the experimental group could be held in
violation for failing to attend treatment, a probation condition that did not apply
to those in the control group. Analysis indicated, however, that although
probationers may have had their probations revoked for failing to attend
treatment, in all cases but one, this was only one of several reasons listed in the
revocation. It does not seem that men were found to be in violation of
probation exclusively for failing to attend domestic violence classes. 

Forty-eight percent of the experimental group and 45 percent of the men in the
control group were cited for VOPs at least once during their year on probation.
This difference was not significant. Another regression analysis was performed
to determine whether treatment assigned, treatment received, or stake-in-
conformity variables could account for the variation. Other things being equal,
those assigned to the experimental group were 2.8 times more likely to be cited
for VOPs than those in the control group. The more classes a man attended,
the less likely he was to be cited for VOPs. The fact that attendance at
domestic violence classes was mandatory, however, somewhat offset the
classes' estimated benefit.

The importance of stake-in-conformity variables in predicting successful
completion of probation is clear. The number of months employed best predicts
VOP. Residential stability, age, and marital status also are significantly related
to VOP. A man who moves is more likely to be cited for probation violations,
as are younger jobless men. Married men are less likely to be cited for
probation violations. The higher likelihood of violation among men in the
experimental group does not seem to be due to increased monitoring; no
significant differences were found between groups in the way the probation
office monitored batterers on probation.

Official measures--rearrests 

Twenty-four percent of men in both the experimental and control groups were
rearrested at least once during their year on probation. Regression analysis was
performed to determine whether treatment assigned, treatment received, or
stake-in-conformity variables were significant in predicting rearrest. Assignment
to the experimental group was not significantly related to likelihood of being
rearrested, but attending domestic violence classes and the interaction between
group assignment and treatment received were significant in predicting
rearrests, as were employment and age. Employment was the most important
factor accounting for variation in rearrests. These findings lead to two primary
conclusions. First, batterers who are assigned to treatment and fail to attend
most or all of the sessions are more likely to be rearrested than similarly situated
men who are not ordered to attend counseling. Second, lack of steady
employment is more important than nonattendance in predicting rearrest.

Attending domestic violence classes can significantly reduce the likelihood of
rearrest both for those assigned to the BIP and for those placed into the control
group. When comparing similarly situated men (in terms of marital status,
employment, residential stability, and age), however, those in the control group
almost always fared better than those in the experimental group on rearrest.

Design limitations 

The controversy surrounding the Broward experiment led to low victim
response rates, high staff turnover, delays, and other problems. The low victim
response rate was a special concern because research consistently indicates
that victims provide the best information on continuing abuse.[7] Despite these
concerns, the fact that this study collected information from multiple sources
(men's self-reports, victims' reports, and official measures) that all indicated
similar conclusions bolstered researchers' confidence in the results from each
measure. 

This experiment provided a valid and rigorous test of the effectiveness of
court-mandated counseling as carried out in Broward County, and similar tests
ought to be performed in other jurisdictions. The authors have been candid in disclosing the
problems involved in conducting this study[8] in the hope that others will learn
from their mistakes and build better and stronger experiments.

Policy implications 

The results of this study show that counseling had no clear and demonstrable
effect on offenders' attitudes, beliefs, or behavior. Evidence of severe physical
abuse still existed, even at 6 and 12 months after sentencing.

Official reports provided some evidence that men assigned to the counseling
programs were more likely to be rearrested than those in the control group
unless they attended all of the court-mandated counseling sessions. Some may
say that this proves that every legal means must be used to get batterers to
attend treatment. Even those men who attended all their sessions, however,
were only slightly less likely to be rearrested than similarly situated men in the
control group who attended no sessions. When they did not attend all the
sessions, they were more likely to be rearrested than their counterparts in the
control group. 

The charge to "throw the book" at the man who does not attend all of his
treatment sessions seems to miss the point. In this jurisdiction, unlike those
observed by Adele Harrell[9] and Sally Palmer and her colleagues,[10] men
were monitored and sanctioned. Although approximately 33 percent of the men
failed to attend treatment, all of them were cited for violating one or more
probation conditions and 71 percent of them were cited for failing to attend
counseling. The probation office did its job; probation was revoked when men
did not complete the batterers' program. Nevertheless, some men completed
the treatment and others dropped out. Finally, this study indicated the 
importance of stake-in-conformity variables in predicting rearrest among men
convicted of misdemeanor domestic violence.

Notes

1. The terms "convicted" or "adjudicated" have legal significance. This study
included men who had either (1) pled guilty or no contest to domestic violence
battery charges or who were found guilty after trial and were placed on
probation, or (2) been placed on probation, whether adjudicated guilty or not,
for the offense of domestic violence battery, or (3) been found guilty of or
placed on probation for crimes of domestic violence. The vast majority of
defendants (96 percent) pled no contest to the charges. Throughout this report,
this entire group of men is referred to as those adjudicated or convicted of a
misdemeanor domestic violence charge.

2. t = 1.42, p > .05.

3. t = 2.81, p < .01.

4. Edleson, J., Z. Eisikovits, and E. Guttmann, "Men Who Batter Women: A
Critical Review of the Evidence," Journal of Family Issues 6(2)(1985): 229-
247; Hamberger, L.K., and J. Hastings, "Recidivism Following Spouse Abuse
Abatement Counseling: Treatment Program Implications," Violence and Victims
5(3)(1990): 157-170; Hotaling, G., and D. Sugarman, "An Analysis of Risk
Markers in Husband to Wife Violence: The Current State of Knowledge,"
Violence and Victims 1(2)(1986): 101-124. 

5. Hirschel, J.D., and I. Hutchinson, "Female Spouse Abuse and the Police
Response: The Charlotte, North Carolina Experiment," Journal of Criminal Law
and Criminology 83(1)(1992): 73-119; Palmer, S., R. Brown, and M. Barrera,
"Group Treatment Program for Abusive Husbands: Long-Term Evaluation,"
American Journal of Orthopsychiatry 62(2)(1992): 276-283;
Tolman, R., and A. Weisz, "Coordinated Community Intervention for Domestic
Violence: The Effects of Arrest and Prosecution on Recidivism of Woman
Abuse Perpetrators," Crime and Delinquency 41(4)(1995): 481-495.

6. Petersilia, J., "Implementing Randomized Experiments: Lessons from BJA's
Intensive Supervision Project," Evaluation Review 13(5)(1989): 435-458;
Babbie, E., The Practice of Social Research, Belmont, CA: Wadsworth, 1998.

7. Arias, I., and S. Beach, "Validity of Self-Reports of Marital Violence,"
Journal of Family Violence 2(2)(1987): 139-149; Edleson, J., and M. Brygger,
"Gender Differences in Reporting of Battering Incidences," in Understanding
Partner Violence: Prevalence, Causes, Consequences, and Solutions, ed. S.
Stith and M. Straus, Minneapolis, MN: National Council of Family Relations,
1995: 45-50.

8. Feder, L., and D.R. Forde, "A Test of the Efficacy of Court-Mandated
Counseling for Domestic Violence Offenders: The Broward Experiment," Final
report for National Institute of Justice, grant number 96-WT-NX-0008,
Washington, DC: National Institute of Justice, 2000. NCJRS. NCJ 184752.

9. Harrell, A., Evaluation of Court-Ordered Treatment for Domestic Violence
Offenders: Final Report, Washington, DC: Institute for Social Analysis, 1991.

10. Palmer, S., R. Brown, and M. Barrera, "Group Treatment Program for
Abusive Husbands: Long-Term Evaluation."

----------------------------

About the Authors

Lynette Feder and David R. Forde are associate professors of criminology and
criminal justice at the University of Memphis.

----------------------------

The Brooklyn Experiment

Robert C. Davis, Christopher D. Maxwell, and Bruce G. Taylor

Differences among the studies

Voluntary versus involuntary treatment

Unlike in the Broward County experiment and a similar study by Sally Palmer
and her colleagues,[1] batterers in this study were mandated to treatment by
judicial order rather than by probation departments. This difference has
implications for the kinds of batterers studied. The Palmer and Broward County
studies included all or most batterers sentenced to probation, whether or not
they were willing to undergo treatment. In this study, batterers were eligible for
inclusion only if all parties to the case (prosecution, defense, and judge) agreed
treatment was appropriate. In several cases, such agreement could not be
reached, usually because the defense refused to agree to treatment. Thus, the
results of this study are harder to generalize than the results of the Palmer and
Broward County experiments. On the other hand, because all batterers in this
study's sample agreed to treatment, the study presumably did not include
unmotivated batterers.[2] This point is crucial because it has often been argued
that treatment cannot be expected to work for individuals who are compelled to
attend against their will.[3]

Control group differences 

This difference in how batterers were mandated to treatment also has
implications for comparison groups. The Palmer and Broward County studies
compared treatment with no treatment. In contrast, this study compares
batterers assigned to treatment with batterers assigned to a community service
program irrelevant to the problem of violence. The comparison between
batterer treatment and an irrelevant treatment is appropriate for judicially
mandated treatment referrals (since all convicted batterers must receive some
sentence), just as the comparison between treatment and no treatment is
appropriate for probation-mandated referrals.

Differences in length of treatment

As described in detail below, the treatment sample in this study was split into
two subsamples. Although all batterers randomly assigned to treatment were
ordered to attend 39 hours of group treatment based on the Duluth model,
some attended 1.5-hour weekly sessions for 26 weeks, while others attended
2.5-hour sessions twice a week for 8 weeks. The former treatment model
maximized the time batterers stayed in treatment; the latter reduced the chances
that batterers' initial motivation to seek treatment would flag over time.
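
As a quick arithmetic check, the two schedules described above can be
multiplied out. The sketch below is illustrative only; the function and
variable names are assumptions and are not part of the study. The 26-week
schedule gives exactly 39 hours, while the 8-week schedule as described gives
40 hours, so the 39-hour figure for the shorter format is presumably
approximate or reflects a scheduling detail the report does not state.

```python
def total_hours(hours_per_session: float, sessions_per_week: int, weeks: int) -> float:
    """Total contact hours for a fixed weekly schedule."""
    return hours_per_session * sessions_per_week * weeks


# Schedules as described in the text above.
long_format = total_hours(1.5, 1, 26)   # weekly 1.5-hour sessions for 26 weeks
short_format = total_hours(2.5, 2, 8)   # twice-weekly 2.5-hour sessions for 8 weeks

print(f"26-week format: {long_format} hours")
print(f"8-week format:  {short_format} hours")
```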

Methodology 

In this study, which was conducted using a true experimental design, 376
criminal court defendants were mandated to attend a 39-hour batterer
treatment program or complete 39 hours of community service. Random
assignment was made at sentencing after all parties (judge, prosecutor, and
defense) had agreed to accept a random assignment to batterer treatment.

Batterers and victims were interviewed about new violence on three occasions:
at sentencing, 6 months later, and 12 months later. Official data on new
complaints to the police and new arrests were gathered at 6 and 12 months
after sentencing.

Cases included 

The sampling frame consisted of spousal assault cases in Kings County
(Brooklyn, New York) Criminal Court. All parties agreed in principle to accept
batterer treatment if the defendant was accepted by the Alternatives to
Violence (ATV) program. Selection began on February 19, 1995, and ran
through March 1, 1996. During that time, 376 cases were taken into the
sample, a small percentage of the cases adjudicated during the selection period.

In 64 percent of the cases in the study, defendants were charged with third-
degree assault (a class A misdemeanor). Another 19 percent were charged
with felonious assault (although they pleaded to misdemeanor charges). The
remaining 17 percent were charged with violating restraining orders, menacing,
harassment, and other offenses. Defendants most commonly pleaded guilty and
were then given a conditional discharge that placed them under court control for
1 year. Twenty-three percent of the cases were adjourned in contemplation of
dismissal (cases would be dismissed and records expunged if defendants
avoided arrest and adhered to judicial conditions for 6 months). 

The ATV curriculum 

ATV was based on the Duluth model, which assumes that domestic violence is
a byproduct of male and female roles that result in an imbalance of power. The
curriculum included defining domestic violence, understanding the historical and
cultural aspects of domestic abuse, and reviewing criminal and legal issues.
Through a combination of instruction and discussion, participants were
encouraged to take responsibility for their anger, actions, and reactions.
Sessions were conducted in English and Spanish by two leaders, one male and
one female.

Selection difficulties 

At the time the experiment began, ATV had just expanded the number of
required hours from 1.5 hours once a week for 12 weeks to 1.5 hours once a
week for 26 weeks. This was done to conform with New York State guidelines
and national trends. The longer program, however, drew objections from Legal
Aid Society attorneys,[4] who defended most indigent defendants in Kings
County Criminal Court. The attorneys began to advise their clients against
involvement in the program. Selection slowed to a standstill. At a meeting with
the attorneys, it became clear that they objected to the increased time their
clients would be under court control and the higher session fees they would
have to pay over the course of 26 sessions.

If selection was to be completed on time, these objections would have to be
accommodated. ATV administrators designed a new 8-week format, through
which participants could complete the same 39 hours of treatment in twice-
weekly, 2.5-hour sessions with lower fees per session. The new format began
to be offered after the first 129 participants had been assigned to 26-week
groups. From August 15, 1995, until selection was completed, defendants were
offered a choice between 8-week and 26-week formats. Once the 8-week
groups became available, none of the final 61 participants chose the 26-week
option.
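
The counts given above imply a rough breakdown of the sample by assigned
group. The sketch below is a hedged reconstruction, not a figure reported in
this section: it assumes that the 129 and 61 participants together make up
everyone assigned to batterer treatment, and it does not account for any
judicial overrides of the lottery. The control-group size is inferred by
subtraction.

```python
TOTAL_CASES = 376        # cases taken into the sample (from the text)
ASSIGNED_26_WEEK = 129   # assigned before the 8-week format was offered (from the text)
ASSIGNED_8_WEEK = 61     # final treatment participants, all in 8-week groups (from the text)

# Assumption: these two counts cover everyone assigned to batterer treatment.
treatment_total = ASSIGNED_26_WEEK + ASSIGNED_8_WEEK
control_total = TOTAL_CASES - treatment_total  # inferred, not stated in the text

groups = [
    ("26-week treatment", ASSIGNED_26_WEEK),
    ("8-week treatment", ASSIGNED_8_WEEK),
    ("community service (control)", control_total),
]
for label, n in groups:
    print(f"{label}: {n} ({n / TOTAL_CASES:.0%} of sample)")
```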

Control group 

Defendants selected by lottery to the control group were mandated by judges
to participate in 39 hours of community service, typically over 2 weeks. For
offenders with jobs, flexible hours were arranged over a 2-month period so
they could continue their jobs. Participants renovated housing units, cleared
vacant lots to make way for community gardens, painted senior-citizen centers,
and cleaned up playgrounds--all activities that would be expected to have little
effect on abusive behavior. During their service, participants were educated
about drugs and HIV. Interested individuals were referred to drug, HIV, or
employment counseling programs.

Participants in both batterer treatment and community service programs were
expelled if a pattern of nonattendance developed (for ATV, three misses
constituted grounds for expulsion). For men assigned to batterer treatment,
such cases were referred to the District Attorney's Office. At the prosecutor's
discretion, delinquent cases could be returned to the court calendar and new
sentences imposed. In practice, however, few cases were restored to the
calendar because the period of court supervision typically was drawing to a
close by the time a clear pattern of noncompliance was established and a
request for restoration completed.

Followup on delinquents was more reliable for the community service group.
The organization running the program had the authority to place delinquent
cases on the court calendar itself, rather than recommending that the prosecutor
do so. If the court issued an arrest warrant for noncompliance, the community
service program had enforcement staff to execute the warrants.

Assignment process 

Cases were drawn from three of eight postarraignment courts in Kings County
Criminal Court. Two of the courts were specialized domestic violence courts.
The third was a jury trial court where domestic violence cases were transferred
if a disposition could not be negotiated. When judge, prosecutor, and defense
reached agreement on batterer treatment as an appropriate disposition,
defendants were screened by ATV for eligibility and assigned by lottery to
batterer treatment or community service.
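
The report does not describe the mechanics of the lottery. As a minimal
sketch of what simple random assignment to two arms can look like, the
hypothetical function below draws each case's arm independently with equal
probability. The function name, the seed, and the case identifiers are all
made up for illustration and do not represent the study's actual procedure.

```python
import random

ARMS = ("batterer treatment", "community service")


def assign_by_lottery(case_ids, seed=None):
    """Assign each case to one arm with equal probability, independently.

    Hypothetical illustration of simple randomization; not the study's procedure.
    """
    rng = random.Random(seed)  # a fixed seed makes the draw reproducible for auditing
    return {case_id: rng.choice(ARMS) for case_id in case_ids}


# Made-up case identifiers, for illustration only.
example_cases = ["case-001", "case-002", "case-003", "case-004"]
assignments = assign_by_lottery(example_cases, seed=1995)
for case_id, arm in assignments.items():
    print(case_id, "->", arm)
```

Independent draws like these leave group sizes to chance; a design that needs
equal-sized arms would instead shuffle a pre-balanced list of assignments.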

After assignment to treatment, the defendant was accompanied back to the
courtroom and the prosecutor was told of the lottery assignment. The
prosecutor told the judge, who then accepted a disposition consistent with the
assignment. In 28 percent of the cases in which batterers were randomly
assigned to the control program, judges mandated that batterers receive
treatment instead. Judges overrode no cases randomly assigned to the ATV
program.

Followup measures 

The most important test of effectiveness for any batterer treatment program is
whether it reduces violence. Therefore, this study included both short-term (6
months after sentencing) and intermediate-term (12 months after sentencing)
followup on treatment outcomes. Short-term outcomes are important to assess
because any treatment effects may be short-lived. The more time passes after a
domestic complaint to police, the less likely future violence becomes.[5] Any
early differences in violence due to treatment might disappear as violence in the
control group declines over time. Longer term followup is important to
determine whether short-term treatment effects continue after batterers are no
longer attending treatment or under court control. 

The study included two measures of new batterer-victim violence: new incidents
reported to criminal justice authorities involving the same victim and victim
reports of new incidents to research interviewers.[6] Violence indicators do not
always behave in similar ways,[7] so it is important to capture more than one.
Both measures were captured at 6 and 12 months after sentencing. Crime
report and arrest data were obtained from official records. Victim self-reports
were obtained primarily through telephone interviews.

In addition to capturing information on new violent acts, the interviews assessed
attitudes and cognitive behaviors among batterers and victims. Conflict
resolution skills and attitudes toward violence in the family were measured for
both the treatment and control groups. Batterers and victims were tested to see
whether they believed they could influence events or thought things simply
happened to them.[8] It seemed plausible to assume that, if batterer treatment
succeeded in making batterers take more responsibility for their actions, their
test results would show more control over those actions. Victims were tested to
see how well they were adjusting psychologically. If post-treatment tests
showed that victims had higher self-esteem and a greater sense of well-being, it
could be a sign that treatment had produced a change in the way batterers
treated their partners. 

Interview methodology 

Researchers tried to interview defendants and victims on three occasions: at
selection (court disposition), 6 months later, and 12 months later. Batterers
were interviewed in person in the court building just before they were assigned
to batterer treatment or community service. Subsequent interviews with
batterers and all interviews with victims were conducted primarily by telephone.
Because it was believed victims would be more truthful than batterers in talking
about new violence, special efforts were put into interviewing victims. When
telephone attempts failed, teams of interviewers were sent to victims' homes. If
these attempts also failed, letters were mailed offering first $25 and then $50 for
completion of an interview. In the third set of victim interviews, 70 difficult
cases were turned over to a licensed private investigator. The investigator used
databases to track victims who had moved and provided the research team
with current addresses. He did not confront victims or their acquaintances. The
research team then tried to interview by telephone the women he located.
Ultimately, however, this additional tracking led to no more interviews.

Completion rates 

The completion rate for victim interviews was 50 percent for the first interview,
46 percent for the second, and 50 percent for the third. First interviews with
batterers were obtained with 95 percent of the sample when defendants were
present in court for selection to the treatment program. For the second and
third batterer interviews, completion rates were 40 and 24 percent. Followup
completion rates were higher for victim interviews than for batterer interviews
because researchers went to extra lengths (incentives, in-person visits) to
obtain them.

Findings 

Treatment effects on behavior 

Initial analyses showed that batterers assigned to treatment were less likely to
be accused of battering the same victim again than batterers assigned to
community service. This difference was most pronounced at 6 months after
group assignment, but persisted for a full year (see exhibit 1).

----------------------------

Exhibit 1. Prevalence of criminal justice incidents involving the same victim and
perpetrator

----------------------------

Batterers were far more likely to complete the shorter course of treatment.
Roughly similar proportions of batterers began treatment in both groups (77
percent of those assigned to the 8-week group and 71 percent of those
assigned to the 26-week group attended at least one class), but 67 percent of
the men assigned to the 8-week group graduated, compared with just 27
percent of those assigned to the 26-week group (see exhibit 2).[9] 

----------------------------

Exhibit 2. Attendance in 8- versus 26-week batterers' group

----------------------------
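
The attendance percentages reported above can be combined to show how many of
the men who started each format went on to graduate. This is a derived figure,
not one reported in the text, and it assumes that both percentages use the
same base (all men assigned to that format). The names below are illustrative.

```python
# Percentages reported in the text, as shares of all men assigned to each format.
attendance = {
    "8-week": {"started": 0.77, "graduated": 0.67},
    "26-week": {"started": 0.71, "graduated": 0.27},
}


def completion_among_starters(started: float, graduated: float) -> float:
    """Share of men who graduated among those who attended at least one class."""
    return graduated / started


rates = {fmt: completion_among_starters(**shares) for fmt, shares in attendance.items()}
for fmt, rate in rates.items():
    print(f"{fmt}: {rate:.0%} of starters graduated")
```

Under that assumption, roughly 87 percent of the men who began the 8-week
format graduated, compared with roughly 38 percent of those who began the
26-week format.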

Researchers expected that men assigned to the 8-week group would have a
lower reoffense rate than men assigned to the 26-week group because a larger
proportion of them completed the program. Only the 26-week group, however,
had significantly fewer criminal complaints than the control group at 6 and 12
months after sentencing: The 8-week group and the control group were virtually
indistinguishable (see exhibit 3).

----------------------------

Exhibit 3. Prevalence of criminal justice incidents involving same victim

----------------------------

Victim reports of violence also showed that men who attended 26 weeks of
treatment committed fewer new violent acts than those who attended 8 weeks
or no treatment. These differences, however, were not statistically significant
(see exhibit 4).

----------------------------

Exhibit 4. Prevalence of incidents reported by victims to research interviewers

----------------------------

Even when defendants' age, ethnicity, marital status, employment status, and
arrest history were factored in, the 26-week group had fewer complaints of
new crimes against their battering victims than the 8-week and control groups.
In addition, reports of criminal complaints showed that those in the 26-week
group went significantly longer before battering again.[10]

Treatment effects on attitudes

Researchers also looked at measures of cognitive change in batterers, including
conflict resolution skills, beliefs about domestic violence, and internal versus
external control. As shown in exhibit 5, there is no basis for claiming that
treatment changed batterers' attitudes or ways of dealing with conflict.[11]

----------------------------

Exhibit 5. Means and standard deviations for psychosocial outcomes

----------------------------

Design limitations

This study illustrates the difficulties that can be encountered in carrying out an
experiment with a true experimental design. Substantial concessions had to be
made to court officials to gain their cooperation. Judges were allowed to
override assignments to the control group. If override cases had been included
in the control group, the tests of treatment effects would have been made more
conservative. (Nonetheless, large treatment effects were still found.) Also, the
research team had to offer a treatment alternative that was more palatable to
the defense than the lengthy and costly version it started with. This proved to be
fortuitous because substantial differences in outcomes were found between men
assigned to the 8-week and 26-week groups.

Policy implications 

Does batterer intervention modify attitudes and behavior in a relatively lasting
way, or does it simply suppress violent behavior for the duration of treatment?
The results of this study do not support the view that treatment leads to lasting
changes in behavior. Were that true, the men in the 8-week group (who finished
their treatment long before the followup period expired) ought to have been no
more violent than their counterparts in the 26-week program (who were in
treatment for most of the followup period). That is not what this study showed.
Nor was any evidence found that treatment altered batterers' attitudes toward
spouse abuse, which further suggests that treatment brought about no
permanent changes. 

The results of this study thus support the view that batterer intervention merely
suppresses violent behavior for the duration of treatment. Since, however, the
study was not designed to test the validity of various treatment models, the
results cannot be seen as conclusive. Moreover, they are at odds with results of
other studies that found no difference in reoffense rates according to length of
treatment.[12] Many batterer programs are adopting longer treatment models,
but there is substantial pressure, both from the defense bar and from economic
constraints, to keep time in treatment to a minimum. Thus, the question of
whether treatment works
only as long as men attend counseling is crucial to intelligent policy formulation.

Notes

1. Palmer, S.E., R.A. Brown, and M.E. Barrera, "Group Treatment Program
for Abusive Husbands: Long-Term Evaluation," American Journal of
Orthopsychiatry 62(2)(1992): 276-283; Feder, L., and D.R. Forde, "A Test of
the Efficacy of Court-Mandated Counseling for Domestic Violence Offenders:
The Broward Experiment," Final Report for the National Institute of Justice,
grant number 96-WT-NX-0008, Washington, DC: National Institute of Justice,
2000. NCJRS. NCJ 184752.

2. Of course, participants did not seek treatment of their own volition; they
were mandated by the court to do so. Still, it is common knowledge in
Brooklyn Criminal Court that misdemeanor batterer defendants are not facing
jail time. Participants in treatment certainly knew from counsel that they were
choosing the batterer program not as the only way to keep out of jail but over
another alternative to incarceration. 

3. Rosenfeld, B.D., "Court-Ordered Treatment of Spouse Abuse," Clinical
Psychology Review 12(1992): 205-226.

4. At the time the change was made, Legal Aid administrators had pledged
cooperation and had made good on that pledge.

5. Davis, R.C., and B.G. Taylor, "A Proactive Response to Family Violence:
The Results of a Randomized Experiment," Criminology 35(2)(1997): 307-
333.

6. These indicators are commonly used in studies tracking households where
domestic violence occurs, such as NIJ's Spouse Assault Replication Program
research. See, for example, Fagan, J., J. Garner, and C.D. Maxwell, "Reducing
Injuries to Women in Domestic Assaults," Final Report, Washington, DC: U.S.
Department of Health and Human Services, Centers for Disease Control and
Prevention, National Center for Injury Prevention and Control, 1997.

7. See, for example, Davis, R.C. and B.G. Taylor, "A Proactive Response to
Family Violence: The Results of a Randomized Experiment."

8. Cognitive measures included the Inventory of Beliefs About Wife Beating
Scale, Harrell's measure of Conflict Resolution Skills, and a shortened (12-
item) version of the Nowicki-Strickland Internal-External Control Scale.
Saunders, D.G., A.B. Lynch, M. Grayson, and D. Linz, "Inventory of Beliefs
about Wife Beating: The Construction and Initial Validation of a Measure of
Beliefs and Attitudes," Violence and Victims (2) (1987): 39-55; Harrell, A.,
Evaluation of Court-Ordered Treatment for Domestic Violence Offenders,
Final Report to the State Justice Institute, Washington, DC: The Urban
Institute, 1991; Nowicki, S., and M.P. Duke, "A Locus of Control Scale for
Non-College as Well as College Adults," Journal of Personality Assessment 38
(1974): 136-137.

9. Chi-square (1) = 27.72, p < .001.

10. See Davis, R.C., B.G. Taylor, and C.D. Maxwell, "Does Batterer
Treatment Reduce Violence? A Randomized Experiment in Brooklyn--
Executive Summary Included," Final Report for National Institute of Justice,
grant 94-IJ-CX-0047. Washington, DC: National Institute of Justice, 2000.
NCJRS. NCJ 180772.

11. For each scale, means across the three treatment groups are remarkably
similar, and none of the tests shown in exhibit 5 comes close to statistical
significance. Limitations in the scales and the data, however, do not permit a
complete test of this hypothesis. For a discussion of these limitations, the reader
is referred to the full report. See Davis, R.C., B.G. Taylor, and C.D. Maxwell,
"Does Batterer Treatment Reduce Violence? A Randomized Experiment in
Brooklyn--Executive Summary Included."

12. Edleson, J.L., and M. Syers, "Relative Effectiveness of Group Treatments
for Men Who Batter," Social Work Research and Abstracts 26(2) (1989): 10-
17; Gondolf, E., Multi-Site Evaluation of Batterer Intervention Systems: A
Summary of Preliminary Findings (Indiana, PA: Mid-Atlantic Addiction
Training Institute, 1997).

----------------------------

About the Authors

Robert C. Davis is senior research associate at the Vera Institute of Justice in
New York City. Christopher D. Maxwell is an assistant professor at Michigan
State University's School of Criminal Justice. Bruce G. Taylor is deputy
director of the Arrestee Drug Abuse Monitoring (ADAM) Program in NIJ's
Office of Research and Evaluation. At the time this research was conducted, all
three were affiliated with Victim Services Research in New York City.

----------------------------

Analyzing the Studies

Shelly Jackson

There are two possible explanations for the finding that the batterer intervention
programs (BIPs) in Brooklyn and Broward County had little or no effect on
their clients. One is that the evaluations were methodologically flawed; the other
is that the design of the programs themselves is flawed. These two
explanations are not necessarily mutually exclusive.

Methodological issues 

Response and attrition rates 

Both programs had low response rates and high dropout rates[1]--
characteristics that can lead to overly positive estimates of program effects.
Those who continue to batter are not likely to participate in intervention
programs; if they participate in the beginning, they are likely to drop out. Hence,
drawing on a sample of "available" participants is problematic. It is unclear
whether the effect found in the Brooklyn evaluation is the result of attrition or a
true program or monitoring effect.

Valid and reliable outcome measures

Another problem that complicates BIP evaluations is the lack of valid and
reliable measures of batterer behavior and attitudes. The revised Conflict
Tactics Scale (CTS2) is often used, but this instrument was not designed to be
repeated over time. It therefore may be an inappropriate tool for "before" and
"after" measurements. Because no scientifically agreed-upon outcome measures
exist specifically for this purpose, at a minimum, evaluations should include
multiple outcome instruments that use multiple sources to validate results.[2]

Multiple sources of data 

Using more than one source of data to measure the impact of a program
increases the validity of the findings. Both studies used multiple data sources
(batterer self-reports, victim reports, and official records). In Brooklyn, the
researchers initially found differences only in batterers' reports of battering
again. After statistically controlling for several variables, however, victim
reports and official records replicated those reports. Although official records
commonly are used to validate batterer and victim reports, the use of official
rearrest records remains problematic. Rearrests capture only those violations
that reach the authorities, whereas there is evidence that batterers often avoid
rearrest by using psychological and verbal abuse.[3] Probation violations,
another form of official records used in the Broward County study, are likely to
be overly broad and may not necessarily indicate a battering incident.

Definition of success 

A related issue is whether success should be defined as complete cessation of
violence or merely as a reduction in violence.[4] The studies in this report
consider a reduction in violence to be a success. This choice is based on the
premise that it may be unrealistic to expect batterers to change an established
pattern of behavior dramatically after a relatively short intervention. Yet even a
statistically significant reduction in violence may be of little practical significance
to a battered woman.[5] 

Problems with random assignment

Random assignment to treatment and control groups is critical in an
experimental study. It makes it likely that preexisting differences among
participants are evenly distributed across the groups, which allows
researchers to attribute any subsequent differences to the program. Ensuring
that assignment is truly random, however, is often difficult.

In Broward County, changes to random assignment were minimal. The
Brooklyn study experienced considerably more difficulty in this respect. After
agreement of the courts had already been obtained, the length of the
intervention had to be increased from 12 to 26 weeks to comply with New
York State guidelines. This change apparently concerned many defense
attorneys. After the evaluation had already begun (129 subjects had been
enrolled in the 26-week evaluation), the Legal Aid Society, which represented
many of the defendants, began to advise its clients not to participate. Research
assignment to the intervention ceased. To obtain the necessary number of
clients in the research sample, a compromise was reached: An 8-week program
was offered that contained the same content as the 26-week program. All
batterers assigned to treatment thereafter chose the 8-week program. Allowing
batterers to choose between treatment programs poses a problem because
there can be no assurance that the group of defendants who chose the 8-week
program was not systematically different from the group assigned to the 26-
week program. Thus, effects cannot be attributed exclusively to the program
(i.e., alternative explanations are plausible). Moreover, in 28 percent of the
control cases in the Brooklyn evaluation, judges overrode the random
assignment and mandated that batterers receive treatment (these cases were
appropriately included in the analyses). In addition, some participants in both
versions of the program were expelled for repeated nonattendance. 

These compromises in random assignment dilute the potential impact of the
intervention and seriously limit the ability to generalize about the evaluation
results. Although the integrity of the Broward County experiment's random
assignment was better, even there, judges overrode the random assignment in
3.5 percent of the cases. Each compromise of the random assignment
decreases confidence in the results. 

Attendance problems 

Many batterers did not attend some or all of their treatment sessions; in
Broward County, 13 percent of the treatment group attended no classes at all
and 3 percent of the control group attended some classes. This raises another
serious methodological issue: What is being measured, treatment assignment or
treatment effects? If batterers can choose to complete or drop out of treatment,
the strength of the experimental design is compromised. Thus, it can be argued
that these evaluations were examining the effects of assignment to a treatment
group as much as the effects of the intervention itself, because not everyone in
the treatment group received the entire intervention. Feder and Forde
statistically tested for this possibility and found no treatment effects.
Nonetheless, this is a common problem BIP evaluations have to face.
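
The distinction between the effect of assignment and the effect of treatment
actually received can be illustrated with a short sketch. The Python below
uses the Broward County attendance figures cited above, but the reoffense
rates are hypothetical placeholders, not results from either study. It
computes an intent-to-treat difference and then a Wald-type estimate of the
effect among those whose attendance followed their assignment. The second
figure rests on standard instrumental-variable assumptions that this report
does not itself invoke, and it is not the test Feder and Forde performed.

```python
def intent_to_treat(rate_assigned_treatment, rate_assigned_control):
    """Difference in outcome rates by ASSIGNED group, ignoring attendance."""
    return rate_assigned_treatment - rate_assigned_control


def complier_effect(itt, attended_if_treatment, attended_if_control):
    """Wald estimator: ITT scaled by the difference in attendance rates.

    Assumes assignment affects outcomes only through attendance and that
    no one does the opposite of their assignment (assumptions, not findings).
    """
    uptake_gap = attended_if_treatment - attended_if_control
    if uptake_gap <= 0:
        raise ValueError("assignment must increase attendance")
    return itt / uptake_gap


# Attendance figures from the Broward County discussion above.
ATTENDED_TREATMENT = 1.0 - 0.13   # 13 percent attended no classes
ATTENDED_CONTROL = 0.03           # 3 percent attended some classes

# HYPOTHETICAL reoffense rates, for illustration only.
REOFFENSE_TREATMENT = 0.20
REOFFENSE_CONTROL = 0.24

itt = intent_to_treat(REOFFENSE_TREATMENT, REOFFENSE_CONTROL)
cace = complier_effect(itt, ATTENDED_TREATMENT, ATTENDED_CONTROL)
print(f"ITT difference: {itt:+.3f}")
print(f"Complier effect: {cace:+.3f}")
```

Because some assigned men never attend and a few controls do, the complier
estimate is always at least as large in magnitude as the intent-to-treat
difference. Which figure matters depends on whether the policy question
concerns mandating treatment or receiving it.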

Time of offense 

Until recently, evaluation researchers have not considered the time of offense
when measuring outcomes. Yet this is important to know. If the offender batters
again during the first week of treatment, it cannot be said that the program had
no effect; rather, the program had no opportunity to affect the batterer. In
contrast, if the offender batters again near the end of the intervention or later,
that may better indicate program effectiveness. Davis, Taylor, and Maxwell
analyzed the Brooklyn data to control for this possibility. Feder and Forde,
however, did not consider time of offense in Broward County, which makes it
more difficult to interpret their results. 
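
One standard way to take time of offense into account is a survival (event
history) analysis. The sketch below is a minimal Kaplan-Meier estimator in
Python. The followup records are invented for illustration, and the function
name and data layout are assumptions of this sketch, not the procedure used
in the Brooklyn analysis.

```python
from collections import Counter


def kaplan_meier(observations):
    """Return [(week, estimated share not yet reoffended)] at each event week.

    observations: list of (week, reoffended) pairs; reoffended=False means
    the person was followed to that week without a recorded new offense
    (a censored observation).
    """
    events = Counter(week for week, reoffended in observations if reoffended)
    exits = Counter(week for week, _ in observations)
    at_risk = len(observations)
    survival = 1.0
    curve = []
    for week in sorted(exits):
        if events[week]:
            # Share of those still at risk who did not reoffend this week.
            survival *= 1.0 - events[week] / at_risk
            curve.append((week, survival))
        # Everyone whose record ends this week leaves the risk set.
        at_risk -= exits[week]
    return curve


# HYPOTHETICAL records: (week of reoffense or last contact, reoffended?)
sample = [(1, True), (3, False), (10, True), (10, True),
          (26, False), (52, False)]
for week, share in kaplan_meier(sample):
    print(f"week {week:2d}: {share:.3f} without a new offense")
```

Plotting such curves separately for each assigned group shows when the
groups diverge, which a single end-of-followup reoffense rate cannot.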

Program design issues 

In addition to these methodological problems, problems with the design of BIPs
themselves could limit their effectiveness.

Faithfulness to program model 

Program models are sometimes not carried out completely. Testing how faithful
programs are to the models on which they are based requires process
evaluations, which, to date, few evaluations have incorporated. 

Conceptual limitations 

BIP designs also may have conceptual limitations. The Duluth model assumes
that all batterers seek to control their partners. Batterers' motivations for
violence may differ, so the same type of intervention may not work with all
batterers. 

BIPs also may be limited by their lack of cultural specificity.[6] Although
domestic violence occurs in all populations, treatment approaches may need to
be tailored to serve specific populations. It may be unreasonable to expect
Duluth-model interventions based on white feminist theory to work effectively
with minority populations. Not everyone agrees with this proposition, however.
The House of Ruth in Baltimore, Maryland, deliberately created an ethnically
integrated group treatment setting based on the Duluth model to stress that
domestic violence has nothing to do with race or socioeconomic status. NIJ has
recently funded an experimental evaluation to examine whether a batterer
intervention model designed specifically for black men is more effective for
them than an integrated model. 

Accommodating special needs 

Although this is changing, few interventions to date have assessed abusers'
mental health and substance abuse treatment needs. These factors do not
excuse the battering, but they may make interventions less effective. Including
more services, however, may have the unintended effect of increasing the length
of a program, its associated costs, and possibly its dropout rates. It is unclear
which is more effective: keeping program length to a minimum or adding
components (and thereby lengthening the program). These factors deserve
more research.

Willingness to change 

Programs may remain minimally effective until they consider the batterer's
readiness to change. Theories focusing on understanding the stages of personal
change suggest that the batterer will change his behavior only when he is ready
to change.[7] Thus, mandating treatment for batterers who are not ready to
change may be ineffective. BIPs may be effective for batterers who are ready
to change, but batterers who are not yet ready may require other interventions.

Policy implications and future directions

Although interventions are proliferating, there is little evidence that they work.
This raises important policy questions: 

--Do batterer intervention programs waste valuable resources? 

--Do they create a false sense of security in women who are led to believe that
their batterer will reform? 

--Is it prudent to mandate batterers to BIPs when there is little evidence that
they work? 

Unfortunately, the latest contributions to this growing literature cannot answer
these questions, and they raise additional issues. Although the Brooklyn study found
some differences between those who completed the 8-week program, those
who completed the 26-week program, and those who attended no program, it
remains unclear whether these differences were due to a program effect or a
monitoring effect. Further research is needed to clarify this issue. 

One thing is clear: Rigorous evaluations are essential to answering the pressing
questions about what works and using that knowledge to influence public
policy. The stakes for women's safety are simply too high to rely heavily on the
use of BIPs without stronger empirical evidence that they work. 

Are these evaluations accurate in saying that BIPs are not very effective at
changing batterers' behaviors and attitudes, or are the small program effects
merely the result of methodological shortcomings in the evaluations themselves
that mask program effectiveness? Both issues may need to be addressed. To
enhance our knowledge, both BIPs and evaluations likely will have to be
improved. 

Improving program evaluations 

Over the years, the quality of BIP evaluation has improved steadily,[8] but
several barriers remain to be addressed. Although a variety of designs have
been used to study BIPs (e.g., pre-post, quasi-experimental, and experimental
designs), most researchers still consider the experiment to be the best
evaluation method. Experimental designs are difficult to carry out in court
settings; the pressures involved reduce many experimental evaluations to quasi-
experiments that cannot deliver the necessary knowledge. Researchers,
practitioners, and policymakers must work together to develop strategies that
enable experimental evaluations to be carried out rigorously. All BIP
evaluations, regardless of design, face difficulties in interviewing batterers and
victims during the followup period. Researchers will need to find innovative
ways to maintain contact with batterers and victims over time.[9] Researchers
also will need to develop reliable and valid outcome measures rather than
relying solely on official records such as rearrests and probation violations to
validate batterer and victim reports. 

Statistical tools can be used to enhance evaluation results once an experimental
evaluation has been completed. One tool is selection modeling,[10] which can
account for nonrandom assignment. The bootstrap method, which provides a
simple means for obtaining an approximate sampling distribution of the statistic
that is conditional on the observed data, is another.[11] Survival or event
history analyses may be useful in accounting for outcomes over time.[12] By
undertaking reviews of several studies, researchers may be able to aggregate
small-scale studies that may have insufficient power to detect differences on
their own.[13]
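
As an illustration of the bootstrap idea described above, the following
Python sketch resamples two groups of outcomes with replacement and reads a
percentile interval off the resulting differences in reoffense rates. The
data, the function name, and the choice of a simple percentile interval are
assumptions made for this example; they are not drawn from either study.

```python
import random


def bootstrap_diff_ci(group_a, group_b, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for mean(group_a) - mean(group_b).

    Each group is a list of 0/1 outcomes (1 = reoffended).
    """
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_resamples):
        # Resample each group with replacement, keeping its original size.
        resample_a = rng.choices(group_a, k=len(group_a))
        resample_b = rng.choices(group_b, k=len(group_b))
        diffs.append(sum(resample_a) / len(resample_a)
                     - sum(resample_b) / len(resample_b))
    diffs.sort()
    lower = diffs[int((alpha / 2) * n_resamples)]
    upper = diffs[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


# HYPOTHETICAL outcomes: 10 of 100 treated and 25 of 100 controls reoffend.
treated = [1] * 10 + [0] * 90
control = [1] * 25 + [0] * 75
low, high = bootstrap_diff_ci(treated, control)
print(f"95% bootstrap interval for the difference: [{low:+.3f}, {high:+.3f}]")
```

An interval that excludes zero suggests the observed difference is unlikely
to be a product of sampling variation alone. It says nothing about bias from
attrition or nonrandom assignment, which the bootstrap cannot correct.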

Improving intervention programs 

In addition to improving the quality of the experimental design and results,
improvements in the concepts underlying the various models of BIPs may be
warranted. New intervention approaches could be developed based on
theories derived from existing research into the causes of battering.[14] Useful
research has been conducted on batterer profiles, and new treatment
approaches are being designed to match those profiles with appropriate
interventions.[15] Although this approach still must be tested, it may prove
more productive than a one-size-fits-all approach. It also may be advantageous
for researchers to draw lessons from other disciplines, such as substance abuse
interventions.

BIPs may be effective only in the context of broader criminal justice
innovations. It may be helpful to see interventions as part of a broader criminal
justice and community response to domestic violence that includes arrest,
restraining orders, intensive monitoring of batterers,[16] and changes to social
norms that may inadvertently tolerate partner violence. If monitoring is in part
responsible for lower reoffense rates, as the Brooklyn experiment suggests,
judicial monitoring may be particularly effective.[17] The Judicial Oversight
Demonstration initiative, a collaboration of NIJ, the Violence Against Women
Office, and three local jurisdictions, is testing this proposition.[18] Other
innovations might include mandatory intervention until a committee determines
that the batterer is no longer a danger to his partner (i.e., indeterminate
probation and intervention), an approach that has been used with sex
offenders.[19] 

Improvements in the ways BIPs are put into practice may also be necessary, as
variations in how programs are carried out may reduce program effectiveness.
Some programs have few sanctions for dropping out, whereas others closely
monitor batterer attendance. This suggests the need to test the effectiveness of
close monitoring and required attendance. Consistent with dose-response
theory,[20] batterers should be exposed to the entire program before outcome
measures are taken. Drug treatment research has shown that length of treatment
(i.e., dosage) influences the outcome.[21] One way to determine whether a
program is being carried out as designed is to conduct process and impact
evaluations at the same time to understand how program implementation affects
the impact evaluation.[22] 

The field of batterer intervention is still in its infancy, and much remains to be
learned. Rather than asking whether BIPs work, a more productive question
may be which programs work best for which batterers under which
circumstances,[23] a decidedly more complex question. If this approach is
adopted, improved theories of battering will need to precede new responses,
which in turn will need to be tested. If differential sentencing is incorporated into the
criminal justice system, procedures will need to be developed to ensure that it is
carried out fairly. As BIPs are a relatively new response to a critical social
problem, it is too early to abandon the concept. It is also too early to believe
that we have all the answers. Research and evaluation supported by NIJ will
continue to add to our growing knowledge of responses to battering, including
batterer intervention programs.

Notes

1. Feder and Forde: Interviews were conducted with 80 percent of defendants
for the initial interview and 50 percent for the second interview; 49 percent of
victims for the initial interview, 30 percent and 22 percent for subsequent
interviews. Davis, Taylor, and Maxwell: Interviews were conducted with 50
percent of the victims at the first interview, 46 percent and 50 percent for
subsequent interviews. Because interviews were conducted in court at intake,
95 percent of batterers were interviewed at adjudication; 40 percent and 24
percent for subsequent interviews. 

2. Heckert, D.A., and E.W. Gondolf, "Assessing Patterns of Agreement on
Assault Among Batterer Program Participants and their Partners," paper
presented at the 5th International Family Violence Research Conference at the
University of New Hampshire, Durham, NH, June 29-July 2, 1997. Also see
Burt, M.R., A.V. Harrell, L.C. Newmark, L.Y. Aron, L.K. Jacobs, et al.
Evaluation Guidebook for Projects Funded by S.T.O.P. Formula Grants Under
the Violence Against Women Act, Washington, DC: The Urban Institute, 1997. 

3. Gondolf, E.W., "Patterns of Reassault in Batterer Programs," Violence and
Victims 12(4)(1997): 373-87; Harrell, A.V., Evaluation of Court-Ordered Treatment
for Domestic Violence Offenders, Final Report Submitted to the State Justice
Institute. Washington, DC: The Urban Institute, 1991.

4. Edleson, J.L., "Controversy and Change in Batterer's Programs," in Future
Interventions with Battered Women and their Families, ed. J.L. Edleson and
Z.C. Eisikovitz. Thousand Oaks, CA: Sage Publications, 1996.

5. Ibid.

6. Williams, O.J., and R.L. Becker, "Domestic Partner Abuse Treatment
Programs and Cultural Competence: The Results of a National Survey,"
Violence and Victims 9(3)(1994): 292.

7. Daniels, J.W., and C.M. Murphy, "Stages and Processes of Change in
Batterers' Treatment," Cognitive and Behavioral Practice 4 (1997): 123-45;
Fawcett, G., L.L. Heise, L. Espegel, and S. Pick, "Changing Community
Responses to Wife Abuse: A Research and Demonstration Project in Iztacalco,
Mexico," American Psychologist 54(1999): 41-49; Murphy, C.M., and V.A.
Baxter, "Motivating Batterers to Change in the Treatment Context," Journal of
Interpersonal Violence 12(1997): 417-422.

8. Davis, R.C., and B.G. Taylor, "Does Batterer Treatment Reduce Violence?
A Synthesis of the Literature," Women and Criminal Justice 10(2)(1999): 69-
93.

9. Gondolf, E.W., "Batterer Programs: What We Know and Need to Know,"
Journal of Interpersonal Violence 12(1)(1997): 83-98; Sullivan, C.M., M.H.
Rumptz, R. Campbell, K.K. Eby, and W.S. Davidson, "Retaining Participants
in Longitudinal Community Research: A Comprehensive Protocol," Journal of
Applied Behavioral Science 32(3)(1996): 262-76.

10. Gondolf, E.W., and A.S. Jones, "The Program Effect of Batterer Programs
in Three Cities," American Journal of Community Psychology, June 2000,
under review; Rossi, P.H., H.E. Freeman, and M.W. Lipsey, Evaluation: A
Systematic Approach, 6th ed. Thousand Oaks, CA: Sage Publications, 1999.

11. Fagan, J. The Criminalization of Domestic Violence: Promises and Limits.
Research Report. Washington, DC: U.S. Department of Justice, National
Institute of Justice, January 1996. NCJ 157641.

12. Gondolf, E.W., "Batterer Programs: What We Know and Need to Know."

13. Lipsey, M., G. Chapman, and N. Landenberger, "Cognitive-Behavioral
Programs for Offenders: A Synthesis of the Research on their Effectiveness for
Reducing Recidivism," paper presented at the "Systematic Reviews of
Criminological Interventions" conference, Washington, DC, April 2-3, 2001.

14. Healey, K., C. Smith, and C. O'Sullivan, Batterer Intervention: Program
Approaches and Criminal Justice Strategies, Issues and Practices, Washington,
DC: U.S. Department of Justice, National Institute of Justice, February 1998.
NCJ 168638.

For examples of research into the causes of battering, see Moffitt, T.E., and A.
Caspi, Findings About Partner Violence from the Dunedin Multidisciplinary
Health and Development Study, Research in Brief, Washington, DC: U.S.
Department of Justice, National Institute of Justice, July 1999. NCJ 170018.

15. Holtzworth-Munroe, A., and G.L. Stuart, "Typologies of Male Batterers:
Three Subtypes and the Differences Among Them," Psychological Bulletin
116(3)(1994): 476-97; Wexler, D.B., "The Broken Mirror: A Self
Psychological Treatment Perspective for Relationship Violence," Journal of
Psychotherapy, Practice, and Research 8(2)(1999): 129-141.

16. A. Klein, as cited in Healey, K., C. Smith, and C. O'Sullivan, Batterer
Intervention: Program Approaches and Criminal Justice Strategies, Issues and
Practices, Washington, DC: U.S. Department of Justice, National Institute of
Justice, February 1998. NCJ 168638, p. 10.

17. But see Gondolf, E.W., "Patterns of Reassault in Batterer Programs,"
Violence and Victims 12(4)(1997): 373-87.

18. "Experiment Demonstrates How to Hold Batterers Accountable," National
Institute of Justice Journal 244 (July 2000): 29.

19. Hafemeister, T.L., "Legal Aspects of the Treatment of Offenders With
Mental Disorders," in R.M. Wettstein, ed., Treatment of Offenders With
Mental Disorders, New York: Guilford Press, 1998: 44-125.

20. Howard, K.I., K. Moras, and W. Lutz, "Evaluation of Psychotherapy:
Efficacy, Effectiveness, and Patient Progress," American Psychologist
51(10)(1996): 1059-1064.

21. Taxman, F.S., "12 Steps to Improved Offender Outcomes: Developing
Responsive Systems of Care for Substance-Abusing Offenders," Corrections
Today 60(6)(1998): 114-117, 166.

22. Rossi, P.H., H.E. Freeman, and M.W. Lipsey, Evaluation: A Systematic
Approach.

23. Gondolf, E.W., "Batterer Programs: What We Know and Need to Know."

----------------------------

About the Author

Shelly Jackson is a program manager in NIJ's Office of Research and
Evaluation.

----------------------------

About the National Institute of Justice

NIJ is the research, development, and evaluation agency of the U.S.
Department of Justice. The Institute provides objective, independent, evidence-
based knowledge and tools to enhance the administration of justice and public
safety. NIJ's principal authorities are derived from the Omnibus Crime Control
and Safe Streets Act of 1968, as amended (see 42 U.S.C. [section] 3721-
3723).

The NIJ Director is appointed by the President and confirmed by the Senate.
The Director establishes the Institute's objectives, guided by the priorities of the
Office of Justice Programs, the U.S. Department of Justice, and the needs of
the field. The Institute actively solicits the views of criminal justice and other
professionals and researchers to inform its search for the knowledge and tools
to guide policy and practice.

Strategic Goals

NIJ has seven strategic goals grouped into three categories: 

Creating relevant knowledge and tools

1. Partner with State and local practitioners and policymakers to identify social
science research and technology needs. 

2. Create scientific, relevant, and reliable knowledge--with a particular
emphasis on terrorism, violent crime, drugs and crime, cost-effectiveness, and
community-based efforts--to enhance the administration of justice and public
safety. 

3. Develop affordable and effective tools and technologies to enhance the
administration of justice and public safety. 

Dissemination

4. Disseminate relevant knowledge and information to practitioners and
policymakers in an understandable, timely, and concise manner. 

5. Act as an honest broker to identify the information, tools, and technologies
that respond to the needs of stakeholders. 

Agency management

6. Practice fairness and openness in the research and development process.

7. Ensure professionalism, excellence, accountability, cost-effectiveness, and
integrity in the management and conduct of NIJ activities and programs. 

Program Areas

In addressing these strategic challenges, the Institute is involved in the following
program areas: crime control and prevention, including policing; drugs and
crime; justice systems and offender behavior, including corrections; violence
and victimization; communications and information technologies; critical incident
response; investigative and forensic sciences, including DNA; less-than-lethal
technologies; officer protection; education and training technologies; testing and
standards; technology assistance to law enforcement and corrections agencies;
field testing of promising programs; and international crime control. 

In addition to sponsoring research and development and technology assistance,
NIJ evaluates programs, policies, and technologies. NIJ communicates its
research and evaluation findings through conferences and print and electronic
media.

To find out more about the National Institute of Justice, please contact:

National Criminal Justice Reference Service
P.O. Box 6000
Rockville, MD 20849-6000
800-851-3420
e-mail: askncjrs@ncjrs.org