🔍 Data Flagging Results

Comprehensive Analysis of Flagged Vehicles in OBFCM
2021-2023 Dataset

🔍 Flagged Vehicles Pattern Analysis

📅 Date: December 24, 2024
📊 Dataset: OBFCM 2021-2023 PHEV data
Original dataset: 7,732,320 records; PHEV subset: 995,511 records


🎯 Analysis Focus: Identifying patterns, correlations, and interesting connections in flagged vehicles


📑 Table of Contents

  1. Executive Summary
  2. Detailed Findings by Filter Step
  3. Cross-Step Patterns
  4. References

📋 Executive Summary

Analysis of flagged vehicles reveals several significant patterns:

| Filter Step | Vehicles Flagged | Key Finding |
| --- | --- | --- |
| Step 1 (CS Invalid) | 33,348 | Dominated by Stellantis group vehicles (Fiat, Jeep, Opel) and Mitsubishi Eclipse Cross |
| Step 2 (Missing RW_EC) | 12,554 | Strongly associated with Hyundai models (Tucson, Santa Fe) and Ford vehicles |
| Step 3 (Missing OEM/Model) | 5,753 | Vehicles with missing manufacturer or model information |
| Step 4 (RW_EC Zero) | 18,779 | Heavily dominated by Porsche Panamera and Cayenne E-Hybrid models |
| Step 5 (VFN Issue) | 32,363 | Volvo/Polestar models and Geely vehicles overrepresented |
| Step 6 (Physics CO2/FC) | 420 | Physically implausible CO₂ or fuel consumption values |
| Step 7 (Mileage/FC Inconsistency) | 565 | Logical impossibilities in mileage and fuel consumption |
| Step 8 (EDS/Energy Violation) | 3,506 | EDS or energy values outside acceptable ranges |

Key Insights

  1. Manufacturer Patterns - Stellantis group shows consistent data quality issues across multiple steps
  2. Vehicle Characteristics - Flagged vehicles show systematic differences in mass, engine power, and electric range
  3. Model-Specific Issues - Certain models (Porsche Panamera, Hyundai Tucson, Mitsubishi Eclipse Cross) show extreme overrepresentation
  4. Data Quality Concerns - Patterns suggest both genuine vehicle characteristics and potential data reporting issues
Figure 1: Flag Totals by Step
Figure 2: OEM Distribution by Step

📊 Detailed Findings by Filter Step

🔴 Step 1: CS Invalid (33,348 vehicles flagged)

What it means: Vehicles with invalid Charge-Sustaining (CS) mode data

Top Manufacturers

| Manufacturer | Vehicles Flagged | Percentage of Flagged | Overrepresentation |
| --- | --- | --- | --- |
| Jaguar Land Rover Limited | 5,654 | 16.95% | - |
| Fiat Group | 5,561 | 16.68% | - |
| Volkswagen | 4,091 | 12.27% | - |
| Ford Werke GmbH | 3,903 | 11.70% | - |
| Skoda | 2,991 | 8.97% | - |

Top Models (by Overrepresentation)

| Model | Overrepresentation |
| --- | --- |
| Mitsubishi Eclipse Cross | 19,345% ⚠️ |
| Volkswagen Passat | 10,020% ⚠️ |
| Jaguar E-PACE P300E R-Dynamic | 9,502% ⚠️ |
| Opel Grandland X | 8,542% ⚠️ |
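
The report does not state how the overrepresentation percentages are computed. One plausible reading, offered here only as an assumption, compares a model's share among flagged vehicles with its share among clean vehicles. Under that reading, values in the tens of thousands of percent arise when a model is common among flagged vehicles but rare among clean ones. The function name and the counts in the usage line are hypothetical, not taken from the dataset.

```python
import math


def overrepresentation_pct(flagged_count: int, flagged_total: int,
                           clean_count: int, clean_total: int) -> float:
    """Assumed metric: how much larger a model's share is among flagged
    vehicles than among clean vehicles, expressed in percent."""
    share_flagged = flagged_count / flagged_total
    share_clean = clean_count / clean_total
    if share_clean == 0:
        # The model never appears among clean vehicles: the ratio is unbounded.
        return math.inf
    return (share_flagged / share_clean - 1.0) * 100.0


# Hypothetical counts: 50 of 1,000 flagged vehicles vs 10 of 10,000 clean vehicles.
print(f"{overrepresentation_pct(50, 1000, 10, 10000):,.0f}%")
```

A value of 0% under this definition would mean the model is equally common in both groups.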

Vehicle Characteristics Comparison

| Characteristic | Flagged Vehicles | Clean Vehicles | Difference | Direction |
| --- | --- | --- | --- | --- |
| Mass | 2,060 kg | 2,121 kg | -2.89% | ⬇️ Lighter |
| TA_CO₂ | 37.4 g/km | 35.2 g/km | +6.10% | ⬆️ Higher |
| Electric Range | 77.5 km | 63.3 km | +22.32% | ⬆️ Longer |
| Engine Displacement | 1,642 cc | 1,863 cc | -11.84% | ⬇️ Smaller |
| Engine Power | 123 kW | 141 kW | -12.92% | ⬇️ Lower |
| Total Mileage | 19,807 km | 26,443 km | -25.10% | ⬇️ Lower |
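
The Difference column appears to be the relative difference of the flagged mean from the clean mean. The sketch below assumes that definition; the function name is hypothetical. Results computed from the rounded means in the table can differ slightly from the reported percentages, presumably because the report used unrounded means.

```python
def percent_difference(flagged_mean: float, clean_mean: float) -> float:
    """Relative difference of the flagged mean from the clean mean, in percent."""
    return (flagged_mean - clean_mean) / clean_mean * 100.0


# Rounded Step 1 means copied from the comparison table: (flagged, clean).
step1_means = {
    "Mass (kg)": (2060, 2121),
    "Engine Power (kW)": (123, 141),
    "Total Mileage (km)": (19807, 26443),
}

for name, (flagged, clean) in step1_means.items():
    print(f"{name}: {percent_difference(flagged, clean):+.2f}%")
```

A negative result means the flagged group sits below the clean group for that characteristic.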

💡 Key Insight

Vehicles flagged in Step 1 tend to be lighter and have smaller engines, but paradoxically show higher CO₂ emissions and longer electric range. This suggests potential issues with:
  • Charge-sustaining mode operation
  • Data reporting in these specific models
  • Possible calibration or sensor issues


🟠 Step 2: Missing RW_EC (12,554 vehicles flagged)

What it means: Vehicles missing Real-World Electric Consumption (RW_EC) data

Top Manufacturers

| Manufacturer | Vehicles Flagged | Percentage of Flagged |
| --- | --- | --- |
| Ford Werke GmbH | 3,720 | 29.63% |
| Stellantis Auto | 3,352 | 26.70% |
| Skoda | 2,603 | 20.73% |
| BMW AG | 597 | 4.76% |
| Hyundai Czech | 590 | 4.70% |

Top Models (by Overrepresentation)

| Model | Overrepresentation |
| --- | --- |
| Hyundai Tucson/Tucson IX35 | 216,430% ⚠️⚠️ |
| Hyundai Santa Fe | 110,157% ⚠️⚠️ |

Vehicle Characteristics Comparison

| Characteristic | Flagged Vehicles | Clean Vehicles | Difference | Direction |
| --- | --- | --- | --- | --- |
| Mass | 1,926 kg | 2,122 kg | -9.21% | ⬇️ Lighter |
| TA_CO₂ | 28.9 g/km | 35.4 g/km | -18.20% | ⬇️ Lower |
| Engine Power | 120 kW | 141 kW | -14.85% | ⬇️ Lower |
| Total Mileage | 18,907 km | 26,314 km | -28.15% | ⬇️ Lower |
| FC_Tot (Total Fuel Consumption) | 1,011 L | 1,610 L | -37.18% | ⬇️ Lower |

💡 Key Insight

Missing RW_EC is strongly associated with Hyundai models (Tucson, Santa Fe) and Ford vehicles. Compared with clean vehicles, the flagged vehicles tend to:
  • Be lighter than average
  • Have lower emissions
  • Have lower total fuel consumption

They may also be newer models or use different monitoring systems.


🟡 Step 3: Missing OEM/Model (5,753 vehicles flagged)

What it means: Vehicles with missing manufacturer or model information

Top Manufacturers

| Manufacturer | Vehicles Flagged | Percentage of Flagged |
| --- | --- | --- |
| Volvo | 1,270 | 22.06% |
| Peugeot | 461 | 8.00% |
| Missing (NA) | 449 | 7.80% |
| Audi | 224 | 3.89% |
| SEAT | 117 | 2.03% |

💡 Key Insight

Step 3 flags vehicles with missing OEM or Model information. This is a data quality issue that affects vehicle identification and analysis. Volvo vehicles show the highest number of missing information cases, followed by Peugeot.


🔵 Step 4: RW_EC Zero (18,779 vehicles flagged)

What it means: Vehicles reporting zero Real-World Electric Consumption (indicating no electric mode usage)
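
Steps 2 and 4 both concern RW_EC but differ in whether the value is absent or reported as exactly zero. A minimal sketch of that distinction follows, assuming a missing value arrives as `None` or NaN; the function name and return labels are hypothetical, not the report's implementation.

```python
import math
from typing import Optional


def classify_rw_ec(rw_ec: Optional[float]) -> str:
    """Separate a missing RW_EC value (Step 2) from a reported zero (Step 4)."""
    if rw_ec is None or math.isnan(rw_ec):
        return "Step 2: Missing RW_EC"
    if rw_ec == 0:
        return "Step 4: RW_EC Zero"
    return "reported"


# Illustrative values only: missing, NaN, zero, and a non-zero reading.
for value in (None, float("nan"), 0.0, 18.2):
    print(value, "->", classify_rw_ec(value))
```

Keeping the two cases apart matters because a missing value points to a reporting gap, while a zero is a reported measurement that may reflect how the vehicle was actually used.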

Top Manufacturers (by Overrepresentation)

| Manufacturer | Overrepresentation |
| --- | --- |
| Porsche | 225,325% ⚠️⚠️⚠️ |
| Ferrari | 420% |
| Suzuki | 378% |

Top Models (by Overrepresentation)

| Model | Overrepresentation |
| --- | --- |
| Porsche Panamera 4S E-Hybrid | 4,426,839% ⚠️⚠️⚠️ |
| Porsche Panamera 4 E-Hybrid | 2,691,396% ⚠️⚠️⚠️ |
| Porsche Panamera 4 | 1,187,172% ⚠️⚠️⚠️ |
| Porsche Cayenne E-Hybrid | 164,224% ⚠️⚠️ |

Vehicle Characteristics Comparison

| Characteristic | Flagged Vehicles | Clean Vehicles | Difference | Direction |
| --- | --- | --- | --- | --- |
| Mass | 2,350 kg | 2,115 kg | +11.11% | ⬆️ Heavier |
| TA_CO₂ | 57.3 g/km | 34.9 g/km | +64.36% | ⬆️ Much Higher |
| Electric Range | 53.3 km | 64.0 km | -16.75% | ⬇️ Shorter |
| Engine Displacement | 2,513 cc | 1,842 cc | +36.39% | ⬆️ Much Larger |
| Engine Power | 213 kW | 139 kW | +52.74% | ⬆️ Much Higher |
| FC_Tot (Total Fuel Consumption) | 2,198 L | 1,591 L | +38.20% | ⬆️ Higher |

💡 Key Insight

This is the most striking pattern: Porsche luxury PHEVs (Panamera, Cayenne) are reporting zero electric consumption. These are:
  • High-performance vehicles
  • Heavy vehicles (2,350 kg average)
  • Large engines (2,513 cc average)
  • High power (213 kW average)

Possible explanations for zero RW_EC:
  1. 🔋 Battery issues preventing electric mode operation
  2. 🚗 Driver behavior (not charging vehicles)
  3. 📊 Data reporting problems specific to Porsche's OBFCM implementation
  4. ⚙️ Design issues where electric mode is rarely engaged


🟣 Step 5: VFN Issue (32,363 vehicles flagged)

What it means: Vehicle Family Name (VFN) validation failures

Top Manufacturers (by Overrepresentation)

| Manufacturer | Overrepresentation |
| --- | --- |
| Geely | 529,641% ⚠️⚠️⚠️ |
| Ferrari | 3,379% |
| Opel Automobile | 1,619% |

Top Models (by Overrepresentation)

| Model | Overrepresentation |
| --- | --- |
| Volvo V60 T6 Twin Engine | 133,677% ⚠️⚠️ |
| Polestar 1 | 111,381% ⚠️⚠️ |
| Volvo XC90 T8 Twin Engine | 92,057% ⚠️⚠️ |

💡 Key Insight

VFN issues are primarily with:
  • Volvo/Polestar models (owned by Geely)
  • Other Geely-owned brands

This suggests:
  • VFN whitelist may need updating
  • Naming inconsistencies in Geely's reporting
  • Corporate structure changes affecting data standardization


🟢 Step 6: Physics CO2/FC (420 vehicles flagged)

What it means: Vehicles with physically implausible CO₂ or fuel consumption values

Criteria: FCgap_perc > 1800 OR TA_CO2 >= 190 OR RW_CO2 > 800

Top Manufacturers

| Manufacturer | Vehicles Flagged | Percentage of Flagged |
| --- | --- | --- |
| Mercedes-Benz | 280 | 66.67% |
| Land Rover | 99 | 23.57% |
| Ferrari | 14 | 3.33% |
| BMW | 8 | 1.90% |
| Volvo | 6 | 1.43% |

💡 Key Insight

Step 6 flags vehicles with physically impossible values, suggesting data reporting errors or sensor malfunctions. Mercedes-Benz vehicles, particularly GLC models, show the highest incidence of physics violations.


🟠 Step 7: Mileage/FC Inconsistency (565 vehicles flagged)

What it means: Vehicles with inconsistent mileage and fuel consumption relationships

Criteria: Logical impossibilities such as fuel consumption without distance, or large distance with zero fuel

Breakdown by Inconsistency Type

| Inconsistency Type | Vehicles Flagged |
| --- | --- |
| CD engine-on (mileage=0, FC>0.1) | 196 |
| CS zero distance (mileage=0, FC>0.1) | 27 |
| CI zero distance (mileage=0, FC>3) | 359 |
| CS large distance, zero fuel (mileage>100, FC=0) | 6 |

💡 Key Insight

Step 7 identifies logical impossibilities in the data, such as reporting fuel consumption without corresponding distance traveled. These inconsistencies suggest data collection or reporting errors.


🔵 Step 8: EDS/Energy Violation (3,506 vehicles flagged)

What it means: Vehicles with EDS or energy values outside acceptable ranges

Criteria: EDS outside 0-100%, negative energy values, or energy accounting inconsistencies

Breakdown by Violation Type

| Violation Type | Vehicles Flagged |
| --- | --- |
| EDS bounds violations (outside 0-100%) | 4,396 |
| Negative energy values | 0 |
| Energy identity violations | 0 |

Top Manufacturers

| Manufacturer | Vehicles Flagged | Percentage of Flagged |
| --- | --- | --- |
| Missing (NA) | 21,395 | 85.12% |
| Jeep | 3,210 | 12.77% |
| Opel | 322 | 1.28% |
| Peugeot | 300 | 1.19% |
| Volvo | 245 | 0.97% |

💡 Key Insight

Step 8 flags vehicles with EDS values outside the physically possible range of 0-100%. Most violations occur in vehicles with missing OEM information, suggesting data quality issues in identification and reporting.
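
The rule-based criteria listed for Steps 6 to 8 can be sketched as simple predicates over one vehicle record. This is an illustration under stated assumptions, not the report's actual implementation. `FCgap_perc`, `TA_CO2`, `RW_CO2` and the thresholds come from the criteria text. The mileage and fuel keys (`cd_engine_on_mileage`, `cs_fc`, and so on), the `EDS` key holding a percentage, and the function names are hypothetical. Missing values are not handled, and only the EDS bounds part of Step 8 is covered.

```python
def step6_physics(rec: dict) -> bool:
    """Step 6: implausible CO2 or fuel-consumption values."""
    return rec["FCgap_perc"] > 1800 or rec["TA_CO2"] >= 190 or rec["RW_CO2"] > 800


def step7_inconsistencies(rec: dict) -> list:
    """Step 7: return the label of every mileage/FC inconsistency that applies."""
    hits = []
    if rec["cd_engine_on_mileage"] == 0 and rec["cd_engine_on_fc"] > 0.1:
        hits.append("CD engine-on")
    if rec["cs_mileage"] == 0 and rec["cs_fc"] > 0.1:
        hits.append("CS zero distance")
    if rec["ci_mileage"] == 0 and rec["ci_fc"] > 3:
        hits.append("CI zero distance")
    if rec["cs_mileage"] > 100 and rec["cs_fc"] == 0:
        hits.append("CS large distance, zero fuel")
    return hits


def step8_eds_out_of_bounds(rec: dict) -> bool:
    """Step 8 (bounds check only): EDS outside the 0-100% range."""
    return not (0 <= rec["EDS"] <= 100)


# Hypothetical record, constructed to trip one rule in each step.
example = {
    "FCgap_perc": 250, "TA_CO2": 195, "RW_CO2": 300,
    "cd_engine_on_mileage": 0, "cd_engine_on_fc": 0.0,
    "cs_mileage": 150, "cs_fc": 0,
    "ci_mileage": 0, "ci_fc": 5,
    "EDS": 104,
}

print(step6_physics(example))
print(step7_inconsistencies(example))
print(step8_eds_out_of_bounds(example))
```

Returning a list of labels for Step 7 allows one vehicle to match several inconsistency types at once.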


📊 Flag Combination Visualizations

The following figures show the distribution of flag combinations across all vehicles. Flag codes represent which filtering steps were triggered:

How combinations work: Multi-digit codes combine steps. Examples: '5' = Step 5 only (VFN Issue), '12' = Steps 1+2 (CS Invalid + Missing RW_EC), '45' = Steps 4+5 (RW_EC Zero + VFN Issue), '58' = Steps 5+8 (VFN Issue + EDS/Energy Violation)
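
The combination codes described above can be built and read back with a few lines. The sketch assumes each code is the triggered step numbers concatenated in ascending order, which is consistent with the examples given ('12', '45', '58'). The function names are hypothetical, and how the report encodes a vehicle with no flags is not stated, so the empty string used here is an assumption.

```python
# Step names as listed in the Executive Summary.
STEP_NAMES = {
    1: "CS Invalid",
    2: "Missing RW_EC",
    3: "Missing OEM/Model",
    4: "RW_EC Zero",
    5: "VFN Issue",
    6: "Physics CO2/FC",
    7: "Mileage/FC Inconsistency",
    8: "EDS/Energy Violation",
}


def encode_flags(steps) -> str:
    """Join the triggered step numbers into a code such as '12' or '58'."""
    return "".join(str(step) for step in sorted(set(steps)))


def decode_flags(code: str) -> list:
    """Expand a combination code back into the names of the triggered steps."""
    return [STEP_NAMES[int(digit)] for digit in code]


print(encode_flags([2, 1]))
print(decode_flags("58"))
```

Single-digit concatenation stays unambiguous only because there are fewer than ten steps.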

Figure 3: Flag Combinations
Figure 4: Detailed Flag Combinations
Figure 5: Clean vs Flagged Vehicle Characteristics

Cross-Step Patterns

🏢 Stellantis Group (Fiat, Opel, Peugeot, Chrysler, Jeep)

Consistent Issues Across Multiple Steps:

| Filter Step | Issue Type | Affected Brands |
| --- | --- | --- |
| Step 1 | CS Invalid | Fiat, Opel, Jeep |
| Step 2 | Missing RW_EC | Stellantis Auto |
| Step 3 | Missing OEM/Model | Stellantis Auto, Peugeot |
| Step 8 | EDS/Energy Violation | Stellantis Europe, Fiat, Opel, PSA |

Analysis

The consistent data quality issues across Stellantis brands suggest:

  1. Common OBFCM implementation across brands
  2. Shared data reporting infrastructure
  3. Potential systemic issues in their PHEV monitoring systems


📚 References

Online Research Sources

  1. Jeep/Chrysler Quality Issues:
  2. Hyundai/Kia Recalls:
  3. Automotive Recall Statistics:
  4. Stellantis Formation:
    • Stellantis formed in 2021 from the merger of FCA (Fiat Chrysler Automobiles) and PSA (Peugeot Société Anonyme)
  5. Geely-Volvo Relationship:
    • Geely acquired Volvo Cars in 2010, which may help explain the VFN naming inconsistencies

Data Sources


📌 Important Notes

⚠️ Disclaimer: This analysis is based on OBFCM data from 2021-2023. Patterns may reflect both genuine vehicle characteristics and data quality issues. Further investigation with manufacturers is recommended for flagged models.

Recommendations

  1. Contact manufacturers for flagged models to verify data reporting accuracy
  2. Update VFN whitelist to include Geely/Volvo naming variations
  3. Investigate Porsche OBFCM implementation for zero RW_EC reporting
  4. Review Stellantis data pipeline for systemic issues
  5. Monitor Hyundai models for missing RW_EC data in future datasets

Document Version: 1.0
Last Updated: December 24, 2024
Author: Data Analysis Team