Making Simple Sentiment Analysis on Laravel
I was finally given a challenging project that wasn’t a CRUD task: gathering news from Indonesian mainstream media and running sentiment analysis on it. Since there are numerous open-source scraping libraries, such as Puppeteer, Beautiful Soup, Selenium, and others, I believe the news-gathering part of the problem already has plenty of ready-made solutions. So I focused on finding a way to do sentiment analysis in PHP rather than Python (because I’m using Laravel as my main framework) that also supports the Indonesian language. Fortunately, I discovered a library called php-sentianalysis-id that meets all of my requirements.
Diving into the library
This library was built by Muhammad Nur Yasir Utomo as a modification of James Hennessey’s project, phpInsight. The changes are in the datasets (lib/PHPInsight/dictionaries and lib/PHPInsight/data), which were converted from English to Indonesian. To be specific, the positive and negative word lists in lib/PHPInsight/dictionaries and lib/PHPInsight/data were generated from a modified version of Devid Haryalesmana’s word list from his project, ID-OpinionWords. The ignore, neutral, and prefix word lists are the original phpInsight lists, modified and translated into Indonesian. The classifier uses a dictionary of words categorized as positive, neutral, or negative, and it calculates the likely sentiment with the Naive Bayes algorithm. The accuracy can be improved by modifying the dictionary and the algorithm.
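To make the idea concrete, here is a minimal sketch of a dictionary-based Naive Bayes scorer. It is not the library’s actual code: the function name naiveBayesScore, the three-word dictionaries, the equal priors, and the add-one smoothing are all assumptions chosen for illustration.

```php
<?php
/**
 * Toy dictionary-based Naive Bayes scorer (illustration only).
 *
 * $dictionary maps a class label to its word list (hypothetical data),
 * $priors maps a class label to its prior probability.
 * Returns normalised scores per class, highest first.
 */
function naiveBayesScore(string $text, array $dictionary, array $priors): array
{
    // Lowercase the text and split on anything that is not a letter.
    $tokens = preg_split('/[^a-z]+/', strtolower($text), -1, PREG_SPLIT_NO_EMPTY);

    // Vocabulary size, used for add-one (Laplace) smoothing.
    $vocabulary = count(array_unique(array_merge(...array_values($dictionary))));

    $scores = [];
    foreach ($dictionary as $class => $words) {
        $score = $priors[$class];
        foreach ($tokens as $token) {
            // A token counts once if it is listed for this class, zero times otherwise.
            $hits = in_array($token, $words, true) ? 1 : 0;
            $score *= ($hits + 1) / (count($words) + $vocabulary);
        }
        $scores[$class] = $score;
    }

    // Normalise so the scores sum to 1, then sort from most to least likely.
    $total = array_sum($scores);
    foreach ($scores as $class => $score) {
        $scores[$class] = $score / $total;
    }
    arsort($scores);

    return $scores;
}

// Hypothetical mini-dictionary; the real library ships far larger word lists.
$dictionary = [
    'pos' => ['bagus', 'senang', 'hebat'],
    'neg' => ['buruk', 'sedih', 'gagal'],
    'neu' => ['biasa', 'cukup', 'lumayan'],
];
$priors = ['pos' => 1 / 3, 'neg' => 1 / 3, 'neu' => 1 / 3];

$scores = naiveBayesScore('Film ini bagus dan hebat', $dictionary, $priors);
echo array_key_first($scores), PHP_EOL; // prints "pos"
```

Words that appear in none of the lists (here "film", "ini", and "dan") contribute the same factor to every class, so only the dictionary words shift the result. That is also why editing the dictionary changes the accuracy so directly.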
If we look at the repository, the core logic of this library lives in the lib/PHPInsight directory, which contains the following files and directories:
data: This directory contains the data sources used in the sentiment analysis process. It includes several PHP files, each holding a list of words categorized by sentiment connotation (positive, negative, neutral, or ignored).
dictionaries: This directory presumably contains the lists of categorized words used by the classifier.
Sentiment.php: This file is a crucial component of the php-sentianalysis-id library. It is part of the PHPInsight namespace and defines the Sentiment class, which is central to the sentiment analysis process.
Autoloader.php: This script automatically loads PHP classes when needed.
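For readers unfamiliar with autoloading, the sketch below shows the general technique using PHP’s spl_autoload_register. It is not a copy of the library’s Autoloader.php: the function registerLibAutoloader, the $baseDir parameter, and the throwaway Demo\Greeter class are all hypothetical, and the sketch assumes that a namespace maps directly onto a directory path.

```php
<?php
// Sketch of a namespace-to-path autoloader. Assumed layout: the class
// Vendor\Name lives in <baseDir>/Vendor/Name.php.
function registerLibAutoloader(string $baseDir): void
{
    spl_autoload_register(function (string $class) use ($baseDir): void {
        // Translate namespace separators into directory separators.
        $file = $baseDir . DIRECTORY_SEPARATOR
            . str_replace('\\', DIRECTORY_SEPARATOR, $class) . '.php';
        if (is_file($file)) {
            require_once $file;
        }
    });
}

// Demonstration with a throwaway class written to a temporary directory.
$baseDir = sys_get_temp_dir() . DIRECTORY_SEPARATOR . 'autoload_demo_' . uniqid();
mkdir($baseDir . DIRECTORY_SEPARATOR . 'Demo', 0777, true);
file_put_contents(
    $baseDir . DIRECTORY_SEPARATOR . 'Demo' . DIRECTORY_SEPARATOR . 'Greeter.php',
    "<?php\nnamespace Demo;\nclass Greeter { public function hello(): string { return 'halo'; } }\n"
);

registerLibAutoloader($baseDir);

// No explicit require for Greeter.php: the autoloader resolves it on first use.
$greeter = new \Demo\Greeter();
echo $greeter->hello(), PHP_EOL; // prints "halo"
```

The callback only runs when PHP meets a class it has not loaded yet, so files are read from disk lazily.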
Implementing on my project
Because the library is written in PHP and provides its own autoload.php, I didn’t face any blockers integrating it into my project. Once the autoloader is included, the library’s classes can be used without requiring each file explicitly. I just needed to copy/clone the lib folder into my project’s root directory and add a require_once for autoload.php in my index.php or my controller (alternatively, you can add it to the composer.json schema with the autoload property). The next step is to integrate it with the collected news data. I’m using a general SQL database, MySQL, to store and load the collected news. As an illustration, I created table columns for the article title, categories, date, and author, plus two columns for the article body: the first stores the body with its HTML tags, and the second stores the body as text only. FYI, you can easily remove HTML tags from text in PHP with the strip_tags function.
<?php
$text = '<p>This is some <strong>bold</strong> text.</p>';
$clean_text = strip_tags($text);
echo $clean_text; // Output: This is some bold text.
?>
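One caveat matters for word-based analysis: strip_tags only deletes the tags, so words in adjacent block elements get glued together, and HTML entities stay encoded. The helper below is a minimal sketch of one way to handle that. The name htmlToPlainText and the list of block-level tags are my own assumptions, not part of the library or of my project code.

```php
<?php
// Hypothetical helper: turn an article's HTML body into plain text for analysis.
function htmlToPlainText(string $html): string
{
    // Replace common block-level tags with a space so that words on either
    // side of e.g. "</h1><p>" are not glued together.
    $blockTags = '~</?(?:p|div|br|h[1-6]|li|ul|ol|tr|td|th|blockquote)\b[^>]*>~i';
    $spaced = preg_replace($blockTags, ' ', $html) ?? $html;

    // Remove the remaining (inline) tags, then decode entities such as &amp;.
    $text = html_entity_decode(strip_tags($spaced), ENT_QUOTES | ENT_HTML5, 'UTF-8');

    // Collapse runs of whitespace into a single space.
    $text = preg_replace('/\s+/', ' ', $text) ?? $text;

    return trim($text);
}

$html = '<h1>Judul</h1><p>Ekonomi &amp; bisnis <strong>tumbuh</strong>.</p>';
echo strip_tags($html), PHP_EOL;      // JudulEkonomi &amp; bisnis tumbuh.
echo htmlToPlainText($html), PHP_EOL; // Judul Ekonomi & bisnis tumbuh.
```

With plain strip_tags, "Judul" and "Ekonomi" become the single token "JudulEkonomi", which no sentiment dictionary will recognise.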
Here is example code showing how I integrate the library with the article data in a controller file:
<?php
namespace App\Http\Controllers;

use App\Models\Article;
use Illuminate\Http\Request;

class ArticleController extends Controller
{
    /**
     * Display a listing of the resource.
     *
     * @return \Illuminate\Http\Response
     */
    public function index()
    {
        // Include the SentimentAnalysis library
        require_once app_path('Path/To/php-sentianalysis-id/autoload.php');

        // Get all articles
        $articles = Article::all();

        // Initialize the SentimentAnalysis class
        $sentiment = new \PHPInsight\Sentiment();

        // Iterate over the articles
        foreach ($articles as $article) {
            // Get the clean_body text
            $text = $article->clean_body;

            // Calculate the sentiment scores
            $scores = $sentiment->score($text);

            // Categorize the sentiment
            $category = $sentiment->categorise($text);

            // Re-label the category
            if ($category == 'pos' || $category == 'neu') {
                $categoryLabel = 'Positif';
            } else {
                $categoryLabel = 'Negatif';
            }

            // Print the scores and category
            echo 'Article: ' . $article->title . '<br>';
            echo 'Sentiment scores: ' . json_encode($scores) . '<br>';
            echo 'Sentiment category: ' . $categoryLabel . '<br><br>';
        }
    }
}
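The re-labelling step can also be pulled out into a small function that works directly on the scores, which makes the "neutral counts as positive" rule easier to test on its own. This is only a sketch: sentimentLabel is a hypothetical helper, the numbers are made up, and it assumes the scores array is keyed by pos, neg, and neu with numeric values.

```php
<?php
// Hypothetical helper: map per-class scores to the article's display label.
// Assumes $scores looks like ['pos' => 0.5, 'neg' => 0.3, 'neu' => 0.2].
function sentimentLabel(array $scores): string
{
    if ($scores === []) {
        throw new InvalidArgumentException('Scores must not be empty.');
    }

    // Pick the class with the highest score, regardless of array order.
    $top = array_keys($scores, max($scores))[0];

    // Same rule as the controller: neutral is folded into "Positif".
    return in_array($top, ['pos', 'neu'], true) ? 'Positif' : 'Negatif';
}

// Made-up scores, for illustration only.
echo sentimentLabel(['pos' => 0.2, 'neg' => 0.7, 'neu' => 0.1]), PHP_EOL; // Negatif
echo sentimentLabel(['pos' => 0.3, 'neg' => 0.3, 'neu' => 0.4]), PHP_EOL; // Positif
```

Folding neutral into positive is a choice made for this project. A three-way label would only need one more branch.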
The result of the code above looks like this (with some improvements to the controller response and the view file).
Demo Code
Of course, I can’t share or demo my whole code because it was a private project. Instead, I have deployed the php-sentianalysis-id library on phpsandbox.io, so you can see the live demo at https://3qeek.ciroue.com/.
References
https://github.com/yasirutomo/php-sentianalysis-id/