Sentiment Analysis in Java: Fast and Free
In the world of financial technology, sentiment analysis has become a crucial tool for understanding market dynamics and investor behavior. While Large Language Models (LLMs) like GPT-4 can provide excellent sentiment analysis, they come with costs and latency concerns that make them less suitable for high-frequency analysis or applications where speed and cost-effectiveness are paramount.
In this article, we'll explore how to implement a fast, free, and highly accurate financial sentiment analysis system using a pre-trained DistilRoBERTa model from Hugging Face, converted to TorchScript, and integrated into a Spring Boot application using the Deep Java Library (DJL).
This is the same approach I used to build BullSentiment.com - a real-time stock sentiment analysis platform that processes thousands of financial news articles daily to provide actionable market insights.
Repository
The complete source code for this tutorial is available on GitHub:
https://github.com/gaetanopiazzolla/sentiment-analysis-java
Quick Start
If you want to get up and running quickly, here's what you need:
Prerequisites
- Java 24+ (or adjust build.gradle.kts for your version)
- Python 3.10+ (for model conversion only)
- Gradle 8+
Steps
- Clone the repository:
git clone https://github.com/gaetanopiazzolla/sentiment-analysis-java.git
cd sentiment-analysis-java
- Convert the model to TorchScript:
cd model-conversion-script
python3 -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
pip install -r requirements.txt
python3 convert_to_torchscript.py
cd ..
- Run the application:
./gradlew bootRun
- Test the API:
curl "http://localhost:8080/api/sentiment/analyze?text=The%20company%20reported%20record%20profits%20this%20quarter"
That's it! You now have a working sentiment analysis API.
Why Choose This Approach Over LLMs?
While LLMs offer impressive capabilities, our approach provides several key advantages:
- Speed: Local model inference is significantly faster than API calls
- Cost: Completely free after initial setup - no per-request charges
- Privacy: All processing happens locally, no data leaves your infrastructure
- Consistency: Deterministic results without the variability of LLM responses
- Scalability: No rate limits or API quotas to worry about
The Model: DistilRoBERTa Fine-tuned for Financial News
We're using the mrm8488/distilroberta-finetuned-financial-news-sentiment-analysis model from Hugging Face. It is fine-tuned on the financial_phrasebank dataset and reports roughly 98% accuracy on that dataset's evaluation split.
The model has 6 layers, 768 dimensions, and 12 heads, totaling 82M parameters. On average, DistilRoBERTa is twice as fast as RoBERTa-base while maintaining excellent accuracy.
Project Structure
sentiment-analysis-java/
├── build.gradle.kts
├── model-conversion-script/
│   ├── convert_to_torchscript.py
│   ├── requirements.txt
│   └── (model files after conversion)
└── src/main/java/gae/piaz/sentiment/
    ├── SentimentAnalysisApplication.java
    ├── config/
    │   └── SentimentModelConfig.java
    ├── controller/
    │   └── SentimentController.java
    └── service/
        └── SentimentAnalyzerService.java
Model Conversion: From Transformers to TorchScript
To use this model with DJL in our Spring Boot application, we need to convert it from the Transformers format to TorchScript. The convert_to_torchscript.py script handles this:
import torch
from transformers import RobertaForSequenceClassification, RobertaTokenizer
model_name = "mrm8488/distilroberta-finetuned-financial-news-sentiment-analysis"
# Load the model and tokenizer
tokenizer = RobertaTokenizer.from_pretrained(model_name)
model = RobertaForSequenceClassification.from_pretrained(model_name)
model.eval()
# Create example input for tracing
sample_text = "Operating profit totaled EUR 9.4 mn, down from EUR 11.7 mn in 2004."
inputs = tokenizer(sample_text, return_tensors="pt", padding=True, truncation=True, max_length=512)
# Trace the model
with torch.no_grad():
traced_model = torch.jit.trace(
model,
(inputs["input_ids"], inputs["attention_mask"]),
strict=False
)
# Save the traced model
traced_model.save("model.pt")
The script also downloads the necessary tokenizer files (config.json, tokenizer.json, merges.txt, vocab.json) that DJL needs for text preprocessing.
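If you want to check the exported files before wiring up Spring, you can load the tokenizer directly with DJL's HuggingFaceTokenizer. This is a quick sanity-check sketch, not part of the repository; the TokenizerCheck class name is just for illustration, and it assumes the files ended up in model-conversion-script/:
import java.nio.file.Paths;
import ai.djl.huggingface.tokenizers.Encoding;
import ai.djl.huggingface.tokenizers.HuggingFaceTokenizer;
public class TokenizerCheck {
    public static void main(String[] args) throws Exception {
        // Load the tokenizer.json produced by the conversion script
        try (HuggingFaceTokenizer tokenizer =
                HuggingFaceTokenizer.newInstance(Paths.get("model-conversion-script/tokenizer.json"))) {
            Encoding encoding = tokenizer.encode("Operating profit totaled EUR 9.4 mn.");
            // If this prints a token count, DJL can read the exported tokenizer files
            System.out.println(encoding.getIds().length + " tokens encoded");
        }
    }
}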
Spring Boot Integration
Gradle Dependencies
The build.gradle.kts includes the necessary dependencies:
plugins {
java
id("org.springframework.boot") version "3.4.5"
id("io.spring.dependency-management") version "1.1.7"
}
java {
toolchain {
languageVersion = JavaLanguageVersion.of(24)
}
}
dependencies {
implementation("org.springframework.boot:spring-boot-starter-web")
implementation("org.springframework.boot:spring-boot-starter-actuator")
// DJL (Deep Java Library) for ML inference
implementation(platform("ai.djl:bom:0.33.0"))
implementation("ai.djl:api")
runtimeOnly("ai.djl.pytorch:pytorch-engine")
implementation("ai.djl.huggingface:tokenizers")
implementation("ai.djl:model-zoo")
}
tasks.withType<JavaExec> {
jvmArgs(
"--sun-misc-unsafe-memory-access=allow",
"--enable-native-access=ALL-UNNAMED"
)
}
Note the JVM arguments - these are required for DJL to work properly with newer Java versions.
Application Configuration
The application.yaml configuration is minimal:
sentiment:
model-dir: ${MODEL_DIR:model-conversion-script}
logging:
level:
gae.piaz.sentiment: INFO
The model-dir property points to the directory containing the converted model files; because it defaults through ${MODEL_DIR:model-conversion-script}, you can relocate the model by setting the MODEL_DIR environment variable instead of editing the YAML.
Model Configuration Bean
The SentimentModelConfig class initializes the DJL predictor:
@Configuration
public class SentimentModelConfig {
private static final Logger logger = LoggerFactory.getLogger(SentimentModelConfig.class);
@Value("${sentiment.model-dir}")
private String modelDir;
private ZooModel<String, Classifications> model;
@Bean
public Predictor<String, Classifications> sentimentPredictor() {
try {
logger.info("Loading DistilRoBERTa financial news sentiment analysis model...");
Criteria<String, Classifications> criteria = Criteria.builder()
.setTypes(String.class, Classifications.class)
.optModelPath(Paths.get(modelDir))
.optModelName("model.pt")
.optOption("modelDir", modelDir)
.optTranslatorFactory(new TextClassificationTranslatorFactory())
.optProgress(new ProgressBar())
.build();
model = criteria.loadModel();
return model.newPredictor();
} catch (ModelException | IOException e) {
throw new RuntimeException("Failed to initialize sentiment analysis model", e);
}
}
@PreDestroy
public void cleanup() {
if (model != null) {
model.close();
}
}
}
The key components here are:
- TextClassificationTranslatorFactory - handles tokenization and output parsing automatically
- optModelPath - points to the directory with the model files
- optModelName - specifies the TorchScript model file
Sentiment Analysis Service
The service layer performs the actual inference:
@Service
public class SentimentAnalyzerService {
private final Predictor<String, Classifications> predictor;
public SentimentAnalyzerService(Predictor<String, Classifications> predictor) {
this.predictor = predictor;
}
public Double analyzeSentimentSimple(String text) {
try {
Classifications result = predictor.predict(text);
return calculateSentimentScore(result.items());
} catch (Exception e) {
return 0.0; // Return neutral sentiment on error
}
}
private double calculateSentimentScore(List<Classifications.Classification> classifications) {
double positiveScore = 0.0;
double negativeScore = 0.0;
double neutralScore = 0.0;
for (Classifications.Classification classification : classifications) {
double weight = classification.getProbability();
String className = classification.getClassName().toLowerCase();
switch (className) {
case "positive" -> positiveScore += weight;
case "negative" -> negativeScore += weight;
case "neutral" -> neutralScore += weight;
}
}
double totalWeight = positiveScore + negativeScore + neutralScore;
if (totalWeight == 0.0) return 0.0;
double pPos = positiveScore / totalWeight;
double pNeg = negativeScore / totalWeight;
double pNeu = neutralScore / totalWeight;
// Score ranges from -1 (negative) to +1 (positive), dampened by neutral
return (pPos - pNeg) * (1.0 - pNeu);
}
}
The sentiment score is a value between -1 (fully negative) and +1 (fully positive), with the neutral probability dampening the extremes. For example, probabilities of 0.80 positive, 0.05 negative, and 0.15 neutral give (0.80 - 0.05) * (1 - 0.15) ≈ 0.64.
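If you need to score many headlines at once (the kind of daily article volume mentioned earlier), DJL's Predictor also offers batchPredict, which can reduce per-call overhead. Here is a minimal sketch of a method you could add to the service; analyzeBatch is not part of the repository:
public List<Double> analyzeBatch(List<String> texts) {
    try {
        // Depending on the translator's batchifier, inputs may be grouped into fewer forward passes
        List<Classifications> results = predictor.batchPredict(texts);
        return results.stream()
                .map(result -> calculateSentimentScore(result.items()))
                .toList();
    } catch (Exception e) {
        // Mirror analyzeSentimentSimple: fall back to neutral scores on error
        return texts.stream().map(text -> 0.0).toList();
    }
}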
REST Controller
The controller exposes a simple API:
@RestController
@RequestMapping("/api/sentiment")
@CrossOrigin
public class SentimentController {
private final SentimentAnalyzerService sentimentAnalyzerService;
public SentimentController(SentimentAnalyzerService sentimentAnalyzerService) {
this.sentimentAnalyzerService = sentimentAnalyzerService;
}
@GetMapping("/analyze")
public ResponseEntity<Double> analyzeSentimentSimple(@RequestParam String text) {
Double sentimentScore = sentimentAnalyzerService.analyzeSentimentSimple(text);
return ResponseEntity.ok(sentimentScore);
}
}
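For completeness, here is a sketch of how the endpoint could be tested without loading the model at all, by mocking the service layer. This test is not part of the repository and assumes spring-boot-starter-test (with Mockito and MockMvc) is on the test classpath:
import org.junit.jupiter.api.Test;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.test.autoconfigure.web.servlet.WebMvcTest;
import org.springframework.boot.test.mock.mockito.MockBean;
import org.springframework.test.web.servlet.MockMvc;
import static org.mockito.Mockito.when;
import static org.springframework.test.web.servlet.request.MockMvcRequestBuilders.get;
import static org.springframework.test.web.servlet.result.MockMvcResultMatchers.content;
import static org.springframework.test.web.servlet.result.MockMvcResultMatchers.status;
@WebMvcTest(SentimentController.class)
class SentimentControllerTest {
    @Autowired
    private MockMvc mockMvc;
    @MockBean
    private SentimentAnalyzerService sentimentAnalyzerService;
    @Test
    void returnsScoreForGivenText() throws Exception {
        // Stub the service so no DJL model is needed for this test
        when(sentimentAnalyzerService.analyzeSentimentSimple("record profits")).thenReturn(0.85);
        mockMvc.perform(get("/api/sentiment/analyze").param("text", "record profits"))
                .andExpect(status().isOk())
                .andExpect(content().string("0.85"));
    }
}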
Testing the API
Once the application is running, you can test it with various financial texts:
# Positive sentiment
curl "http://localhost:8080/api/sentiment/analyze?text=The%20company%20reported%20record%20profits%20and%20exceeded%20all%20expectations"
# Returns: ~0.85
# Negative sentiment
curl "http://localhost:8080/api/sentiment/analyze?text=The%20stock%20plummeted%20after%20the%20company%20announced%20massive%20layoffs"
# Returns: ~-0.72
# Neutral sentiment
curl "http://localhost:8080/api/sentiment/analyze?text=The%20company%20held%20its%20annual%20shareholder%20meeting%20yesterday"
# Returns: ~0.05
Performance Comparison: Local Model vs LLMs
| Aspect | Local DistilRoBERTa | GPT-4/Claude |
|---|---|---|
| Cost | Free | $0.01-0.06 per 1K tokens |
| Speed | ~50-100ms | ~1-3 seconds |
| Privacy | Complete | Data sent to provider |
| Reliability | 99.9%+ | Depends on API uptime |
| Accuracy (Financial) | 98% | Very high but variable |
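The latency figures above depend heavily on hardware and input length, so it is worth measuring on your own machine. Below is a rough (non-rigorous) timing sketch using the same predictor bean; the LatencyCheck class name is just for illustration:
import ai.djl.inference.Predictor;
import ai.djl.modality.Classifications;
import ai.djl.translate.TranslateException;
class LatencyCheck {
    // Average milliseconds per prediction over n runs, after one warm-up call
    static double averageMillis(Predictor<String, Classifications> predictor, int n)
            throws TranslateException {
        String sample = "Operating profit totaled EUR 9.4 mn, down from EUR 11.7 mn in 2004.";
        predictor.predict(sample); // warm-up: the first call is typically slower
        long start = System.nanoTime();
        for (int i = 0; i < n; i++) {
            predictor.predict(sample);
        }
        return (System.nanoTime() - start) / (double) n / 1_000_000.0;
    }
}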
Real-World Application: BullSentiment.com
This exact architecture powers BullSentiment.com, where we:
- Process thousands of financial news articles daily
- Provide real-time sentiment scores for stocks
- Correlate sentiment trends with price movements
- Deliver actionable insights to traders and investors
The combination of speed, accuracy, and zero per-request costs makes this approach ideal for high-volume financial analysis applications.
Conclusion
While LLMs are powerful tools, they're not always the best solution for every problem. For financial sentiment analysis where speed, cost, and privacy are concerns, a well-chosen pre-trained model like DistilRoBERTa can provide excellent results with significant operational advantages.
This approach demonstrates that with the right tools and configuration, you can build a production-ready, enterprise-grade sentiment analysis system that's both fast and free. The combination of Spring Boot, DJL, and a domain-specific model creates a robust foundation for financial technology applications.
The key is understanding your requirements and choosing the right tool for the job. Sometimes, the best solution is simpler than you think.
Interested in seeing this in action? Check out BullSentiment.com for real-time stock sentiment analysis. The complete source code for this tutorial is available at github.com/gaetanopiazzolla/sentiment-analysis-java.