BACKGROUND Although risk factors for kidney transplant failure are well described, prognostic risk scores to estimate risk in prevalent transplant recipients are limited.
STUDY DESIGN Development and validation of risk-prediction instruments.
SETTING & PARTICIPANTS The development data set included 2,763 prevalent patients more than 12 months posttransplant enrolled into the LOTESS (Long Term Efficacy and Safety Surveillance) Study. The validation data set included 731 patients who underwent transplant at a single UK center.
PREDICTOR Estimated glomerular filtration rate (eGFR) and other risk factors were evaluated using Cox regression.
OUTCOME Scores for death-censored and overall transplant failure were based on the summed hazard ratios for baseline predictor variables. Predictive performance was assessed using calibration (Hosmer-Lemeshow statistic), discrimination (C statistic), and clinical reclassification (net reclassification improvement) compared with eGFR alone.
RESULTS In the development data set, 196 patients died and another 225 experienced transplant failure. eGFR, recipient age, race, serum urea and albumin levels, declining eGFR, and prior acute rejection predicted death-censored transplant failure. eGFR, recipient age, sex, serum urea and albumin levels, and declining eGFR predicted overall transplant failure. In the validation data set, 44 patients died and another 101 experienced transplant failure. The weighted scores comprising these variables showed adequate discrimination and calibration for death-censored (C statistic, 0.83; 95% CI, 0.75-0.91; Hosmer-Lemeshow χ² P = 0.8) and overall (C statistic, 0.70; 95% CI, 0.64-0.77; Hosmer-Lemeshow χ² P = 0.5) transplant failure. However, the scores failed to reclassify risk compared with eGFR alone (net reclassification improvements of 7.6% [95% CI, -0.2 to 13.4; P = 0.09] and 4.3% [95% CI, -2.7 to 11.8; P = 0.3] for death-censored and overall transplant failure, respectively).
LIMITATIONS Retrospective analysis of predominantly cyclosporine-treated patients; limited study size and categorization of variables may limit power to detect effect.
CONCLUSIONS Although the scores performed well regarding discrimination and calibration, clinically relevant risk reclassification over eGFR alone was not evident, emphasizing the stringent requirements for such scores. Further studies are required to develop and refine this process.
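For readers unfamiliar with the net reclassification improvement (NRI) reported above, the following is a minimal sketch of the standard category-based calculation for a binary outcome. It is not the study's analysis code: the function name, the risk categories, and the toy data are hypothetical, and the sketch ignores censoring, which a survival analysis of transplant failure would need to handle.

```python
from typing import Sequence


def net_reclassification_improvement(
    old_category: Sequence[int],
    new_category: Sequence[int],
    event: Sequence[bool],
) -> float:
    """Category-based NRI for a binary outcome (no censoring; illustrative only).

    Categories are ordered integers (higher = higher predicted risk).
    NRI = [P(up | event) - P(down | event)]
        + [P(down | non-event) - P(up | non-event)]
    """
    up_event = down_event = n_event = 0
    up_nonevent = down_nonevent = n_nonevent = 0

    for old, new, had_event in zip(old_category, new_category, event):
        if had_event:
            n_event += 1
            up_event += new > old      # correctly moved to higher risk
            down_event += new < old    # incorrectly moved to lower risk
        else:
            n_nonevent += 1
            up_nonevent += new > old   # incorrectly moved to higher risk
            down_nonevent += new < old # correctly moved to lower risk

    if n_event == 0 or n_nonevent == 0:
        raise ValueError("need at least one event and one non-event")

    return (up_event - down_event) / n_event + (
        down_nonevent - up_nonevent
    ) / n_nonevent


# Hypothetical toy data: risk category under the reference model (e.g. eGFR
# alone) versus under the new score, for 4 events and 4 non-events.
old_cat = [0, 1, 1, 2, 0, 1, 2, 0]
new_cat = [1, 1, 2, 2, 0, 0, 1, 1]
events = [True, True, True, True, False, False, False, False]

nri = net_reclassification_improvement(old_cat, new_cat, events)
print(f"NRI = {nri:.2f}")
```

A positive NRI means the new score moves patients with events toward higher risk categories and patients without events toward lower ones, on balance. A confidence interval that includes zero, as reported in RESULTS, means such an improvement over eGFR alone could not be demonstrated.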