Design UTF-8 Encoded Web Application With Spring Boot, AngularJS and MySQL

14 Apr 2021 | MIT License | 15 min read
Create a simple web app to support UTF-8 encoded strings
This tutorial will show how to create a simple web application that can support UTF-8 encoded strings using Spring Boot, AngularJS, and MySQL.

Introduction

When I started programming, I tried to build non-English web sites. It was the GeoCities era, all static pages, and I had to work with code pages (in which multiple bytes can represent a single character), specifically Chinese character encodings such as GB2312, GB18030, or Big5. At the same time, international committees were working on the Unicode character set (with 16-bit and 32-bit encodings). Then UTF-8 came out and became a standard. It is a variable-length character encoding, which at first looks even worse than the code pages, because different characters are represented by different numbers of bytes. Writing a decoder for such an encoding could be a nightmare. The advantage is that it is a very flexible way of adding more and more characters to the character set. Over the years, after it became the standard, all the OSes came to support UTF-8, browsers support it, and more and more web sites and pages use it. It became the practical way to display all the languages of this planet (plus some more).

Over these years, I didn't have to deal with any of this. I work in the U.S., and all I have to deal with is ASCII. And while I was busy with my own life, the world moved on. Technology evolved, and handling UTF-8 encoding became really easy. The hard problem of fifteen years ago can now be resolved easily. I guess if I ignore a problem long enough, the problem really goes away. Or it is dumbed down enough that it can be resolved without much trouble.

The purpose of this tutorial is to document all the necessary steps to set up a Spring Boot based web application that handles UTF-8 encoding correctly. I can refer back to this tutorial if I ever need to do something similar. I am sure that in the near future I will be adding this to a new project, and having it where I can find it will be very useful.

A quick disclaimer: I created this tutorial and its sample application without doing in-depth research, so my knowledge on this is limited. The sample application that comes with this tutorial is just a starting point; it is not everything you need to know. There will be gaps that could create problems for you, so please do your own research to resolve these issues when they arise. I hope this doesn't scare you away. If it doesn't, please read on.

Image 1

Overall Architecture

I used MySQL to store the data. For simplicity's sake, there is only one table. The database and its table are set to use a UTF-8 charset. The sample application is built with Spring Boot, and a simple controller handles the requests from the web page. There is only one page, and an AngularJS application handles the user interaction with the page.

To get all of this to work, a few setup and configuration steps must be done:

  • Set up the database and table to handle UTF-8.
  • Set up the web page so that it declares UTF-8 and displays it correctly.
  • On the web page, configure the form to handle UTF-8 data (for request and response) correctly.
  • On the server side, make sure that UTF-8 strings can be stored in the database and retrieved back.

Other than these extra steps, the configuration of the web application is the same as for web applications that don't support UTF-8. In the next few sections, I will discuss how these setups can be done. Let's begin with the database and table creation.

MySQL Database and Table

The very first thing I did was look up the configuration I needed for the MySQL database and for creating the table. There are a lot of resources out there that explain in detail how to do it. The key is that the charset must be set to utf8mb4. The reason you can't use the charset that MySQL calls utf8 is that it is not the same as standard UTF-8: it has historically been an alias for utf8mb3, which stores at most three bytes per character, so it cannot hold characters that need four bytes in UTF-8, such as emoji and some rarely used CJK characters. So forget about MySQL's utf8 charset and use utf8mb4. If you need more information, please check the MySQL documentation.
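
A minimal Java sketch of the difference, assuming the commonly documented limit of three bytes per character for MySQL's older utf8 charset. It is not part of the sample project, and the class and method names are made up for illustration. It counts how many bytes a character takes once encoded as standard UTF-8; anything that needs four bytes is what utf8mb4 exists for:

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper class; the name is not from the sample project.
final class Utf8ByteLength {

    // Number of bytes the text occupies once encoded as UTF-8.
    static int utf8Length(String text) {
        return text.getBytes(StandardCharsets.UTF_8).length;
    }

    // True when at least one character lies outside the Basic Multilingual
    // Plane (code point above U+FFFF) and therefore needs 4 bytes in UTF-8.
    static boolean needsFourByteUtf8(String text) {
        return text.codePoints().anyMatch(cp -> cp > 0xFFFF);
    }

    public static void main(String[] args) {
        String ascii = "A";             // U+0041
        String chinese = "\u4E2D";      // U+4E2D, a CJK character
        String emoji = "\uD83D\uDE00";  // U+1F600, written as a surrogate pair

        System.out.println(utf8Length(ascii));          // prints 1
        System.out.println(utf8Length(chinese));        // prints 3
        System.out.println(utf8Length(emoji));          // prints 4
        System.out.println(needsFourByteUtf8(chinese)); // prints false
        System.out.println(needsFourByteUtf8(emoji));   // prints true
    }
}
```

Most everyday Chinese, Japanese, and Korean text fits in three bytes per character, which is why the problem with the older charset often goes unnoticed until someone pastes an emoji.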

Here is my SQL script to create the database, the access user and configure the character set for the database:

SQL
DROP DATABASE IF EXISTS `utf8testdb`;

DROP USER IF EXISTS 'utf8tdbuser'@'localhost';

CREATE DATABASE `utf8testdb`;

CREATE USER 'utf8tdbuser'@'localhost' IDENTIFIED BY '888$Test$888';

GRANT ALL PRIVILEGES ON `utf8testdb`.* TO 'utf8tdbuser'@'localhost';

FLUSH PRIVILEGES;

ALTER DATABASE `utf8testdb` CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;

The first two lines drop the database and the user if either exists. The third line creates the database, and the fourth line creates the user and sets the password. The fifth line grants all privileges on the newly created database to the user. The sixth line flushes the privileges so that they are available to the user. The last line is the most important one: it alters the database, sets its character set to utf8mb4, and selects utf8mb4_unicode_ci as the collation, which is the set of rules used to compare and sort strings.

Creating the table to store the data is very simple, here it is:

SQL
USE `utf8testdb`;

DROP TABLE IF EXISTS `testcontent`;

CREATE TABLE `testcontent` (
   `id` VARCHAR(34) NOT NULL PRIMARY KEY,
   `subject` VARCHAR(512) NOT NULL,
   `content` MEDIUMTEXT NOT NULL
);

Because the entire database is set to use the proper UTF-8 charset before the table is created, the table inherits it, and there is no need for UTF-8 specific configuration at the table level. If you don't want to configure the entire database this way, MySQL also lets you set the character set at the table level (DEFAULT CHARSET on CREATE TABLE) or on individual columns.

Once the table is created, I can test it with a simple insert statement that stores some dummy data, followed by a simple query to retrieve it and check that the value is displayed correctly. Here is a screenshot of the test insert and query statements:

Image 2

Next, I will discuss how to create the HTML page to support UTF8.

HTML Page for UTF-8 Data Handling

For demonstration purposes, the web application is simple. It has a form that allows the user to enter a message subject and message content and then click a button. The subject and content are sent to the back end to be stored. Once the storing completes, the page requests the list of all the stored messages from the server. To support UTF-8 character handling, I have to do two things to the web page. One is to declare that the page uses the UTF-8 charset for display; the other is to set the charset used for data exchange to UTF-8. Both are simple. Here is the source code of the entire page for the web application:

HTML
<!DOCTYPE html>
<html>
   <head>
      <title>AngularJS HTML Editor Directive - by Han</title>
      <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
      <link href="/assets/bootstrap/css/bootstrap.min.css" rel="stylesheet">
      <link href="/assets/styles/index.css" rel="stylesheet">    
   </head>
   <body>
      <div class="container" ng-app="testSampleApp" 
      ng-controller="testSampleController as vm">

         <div class="row">
            <div class="col-xs-12">
               <h3>UTF8 Web App</h3>
            </div>
         </div>
         <div class="row">
            <div class="col-xs-12">
               <form accept-charset="utf-8">
                  <div class="form-group">
                     <label for="subjectLine">Subject</label>
                     <input id="subjectLine" class="form-control" 
                     maxlength="100" ng-model="vm.subjectLine">
                  </div>
                  <div class="form-group">
                     <label for="subjectContent">Content</label>
                     <textarea id="subjectContent"
                               class="form-control fixed-area"
                               maxlength="32767"
                               ng-model="vm.subjectContent"
                               rows="5"></textarea>
                  </div>
               </form>
            </div>
            <div class="col-xs-6">
               <button class="btn btn-success" 
               ng-click="vm.saveUtf8Content()">Save</button>
               <button class="btn btn-default" 
               ng-click="vm.clearContent()">Clear</button>
            </div>
         </div>
         
         <div ng-if="vm.postsList && vm.postsList.length > 0">
            <hr>
            <div class="row" ng-repeat="post in vm.postsList">
               <div class="col-xs-12">
                  <span>{{post.postId}}</span><br>
                  <span>{{post.title}}</span><br>
                  <div ng-bind-html="post.safeHtmlContent"></div>
                  <hr>
               </div>
            </div>
         </div>
         
         <footer class="footer">
            <p>© 2021, Han Bo Sun.</p>
         </footer>

      </div>
    
      <script type="text/javascript" 
      src="/assets/jquery/js/jquery.min.js"></script>
      <script type="text/javascript" 
      src="/assets/bootstrap/js/bootstrap.min.js"></script>
      <script type="text/javascript" 
      src="/assets/angularjs/1.7.5/angular.min.js"></script>
      <script type="text/javascript" 
      src="/assets/angularjs/1.7.5/angular-resource.min.js"></script>
      <script type="text/javascript" 
      src="/assets/app/js/app.js"></script>
      <script type="text/javascript" 
      src="/assets/app/js/testSampleService.js"></script>
    </body>  
</html>

To declare that this web page uses UTF-8 for display, I have to add a meta tag in the head section of the page, like this:

HTML
<html>
   <head>
      ...
      <meta http-equiv="Content-Type" 
      content="text/html; charset=utf-8">
      ...
   </head>
   ...
</html>

Here is the section where I define the form for handling user input and sending user data to the back end. Note the "accept-charset" attribute on the opening form tag. It tells the browser which character encoding to use when the form itself is submitted. In this application, the form is never submitted natively. The data is sent as JSON through ngResource, and AngularJS's $http service by default posts JSON with the Content-Type "application/json;charset=utf-8", so the attribute most likely has no effect here. I put it in the form just in case:

HTML
<div class="row">
   <div class="col-xs-12">
      <form accept-charset="utf-8">
         <div class="form-group">
            <label for="subjectLine">Subject</label>
            <input id="subjectLine" class="form-control" 
             maxlength="100" ng-model="vm.subjectLine">
         </div>
         <div class="form-group">
            <label for="subjectContent">Content</label>
            <textarea id="subjectContent"
                        class="form-control fixed-area"
                        maxlength="32767"
                        ng-model="vm.subjectContent"
                        rows="5"></textarea>
         </div>
      </form>
   </div>
   <div class="col-xs-6">
      <button class="btn btn-success" 
      ng-click="vm.saveUtf8Content()">Save</button>
      <button class="btn btn-default" 
      ng-click="vm.clearContent()">Clear</button>
   </div>
</div>

As shown above, the "subjectLine" input field is bound to the scope variable vm.subjectLine, and the "subjectContent" text area is bound to scope variable vm.subjectContent. There is nothing fancy. The button that sends the user request to back end server is defined as this:

HTML
<button class="btn btn-success"
ng-click="vm.saveUtf8Content()">Save</button>

There is also the place where I display all the messages that are stored in the database. I just list them out one by one with an ngRepeat, like this:

HTML
<div ng-if="vm.postsList && 
vm.postsList.length > 0">
   <hr>
   <div class="row" 
   ng-repeat="post in vm.postsList">
      <div class="col-xs-12">
         <span>{{post.postId}}</span><br>
         <span>{{post.title}}</span><br>
         <div ng-bind-html="post.safeHtmlContent"></div>
         <hr>
      </div>
   </div>
</div>

Anyways, that is all there is for the HTML code. Next, I will discuss the AngularJS code that handles the data exchange.

The AngularJS Code

The most essential part of the front end of this web application is the AngularJS code. I split it into two files. One is called app.js and the other is called testSampleService.js. The first file is the main entry of the front end application. The other is the service object that sends and receives data to and from the server.

Here is the source code for the file app.js:

JavaScript
(function () {
   "use strict";
   var mod = angular.module("testSampleApp", 
   [ "testSampleServiceModule" ]);
   mod.controller("testSampleController", 
   [ "$scope", "$sce", 
   "testSampleService", function ($scope, $sce, testSampleService) {
      var vm = this;
      
      vm.subjectLine = null;
      vm.subjectContent = null;
      
      vm.postsList = null;
      
      loadAllPosts();
      
      vm.saveUtf8Content = function () {
         var obj = {
            title: vm.subjectLine,
            content: vm.subjectContent
         };
         
         testSampleService.sendPost(obj).then(function (result) {
            if (result && result.status === true) {
               loadAllPosts();
            }
         }, function (error) {
            if (error) {
               console.log(error);
            }
         });
      };
      
      vm.clearContent = function () {
         vm.subjectLine = null;
         vm.subjectContent = null;
      };
      
      function loadAllPosts() {
         testSampleService.listAllPosts().then(function(results) {
            if (results && results.length > 0) {
               vm.postsList = results;
               
               angular.forEach(vm.postsList, function (post) {
                  if (post) {
                     post.safeHtmlContent = $sce.trustAsHtml(post.content);
                  }
               });
            };
         }, function (error) {
            if (error) {
               console.log(error);
            }
            vm.postsList = null;
         });
      }
   }]);
})();

As shown, there is nothing special I have to do to support UTF-8 based data exchange. AngularJS supports character encoding without much configuration, and I am sure other modern frameworks support this well too.

In the above code, the function that loads all the subject and content from the back end is called loadAllPosts(). Here it is:

JavaScript
function loadAllPosts() {
   testSampleService.listAllPosts().then(function(results) {
      if (results && results.length > 0) {
         vm.postsList = results;
         
         angular.forEach(vm.postsList, function (post) {
            if (post) {
               post.safeHtmlContent = $sce.trustAsHtml(post.content);
            }
         });
      };
   }, function (error) {
      if (error) {
         console.log(error);
      }
      vm.postsList = null;
   });
}

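In the function, I call the listAllPosts() method of the service testSampleService to get all the message content objects from the back end. Then I mark the content of each object as trusted HTML so that ng-bind-html will display it. That is the call to $sce.trustAsHtml(). Be aware that this call does not sanitize anything: it tells AngularJS to skip its checks, so any markup a user typed into the content field is rendered as is. That is acceptable for a demo but is a cross-site scripting risk in a real application.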
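
Since the stored content ends up rendered as HTML on the page, one possible safeguard is to escape it on the server before returning it. The sketch below is hypothetical and not part of the sample project; the class and method names are made up. It only replaces the five ASCII metacharacters, so UTF-8 text in any language passes through unchanged:

```java
// Hypothetical helper class; the name is not from the sample project.
final class HtmlEscapeDemo {

    // Replaces the five HTML metacharacters with entities.
    // Every other character, including non-ASCII text, is copied as is.
    static String escapeHtml(String text) {
        StringBuilder sb = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '&':  sb.append("&amp;");  break;
                case '<':  sb.append("&lt;");   break;
                case '>':  sb.append("&gt;");   break;
                case '"':  sb.append("&quot;"); break;
                case '\'': sb.append("&#39;");  break;
                default:   sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The tags are neutralized, the Chinese characters are untouched.
        System.out.println(escapeHtml("<b>\u4E2D\u6587</b>"));
    }
}
```

With content escaped this way, the page shows the markup as literal text instead of interpreting it.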

The other important part of this app.js is the method that saves the message object to the backend. It is called vm.saveUtf8Content().

JavaScript
vm.saveUtf8Content = function () {
   var obj = {
      title: vm.subjectLine,
      content: vm.subjectContent
   };
   
   testSampleService.sendPost(obj).then(function (result) {
      if (result && result.status === true) {
         loadAllPosts();
      }
   }, function (error) {
      if (error) {
         console.log(error);
      }
   });
};

The service testSampleService.js is created for handling the data exchange with the back end. This is the whole file:

JavaScript
(function () {
   "use strict";
   var mod = angular.module("testSampleServiceModule", 
   [ "ngResource" ]);
   mod.factory("testSampleService", [ "$resource",
      function ($resource) {
         var svc = {
            sendPost: sendPost,
            listAllPosts: listAllPosts
         };
         
         var apiRes = $resource(null, null, {
            sendPost: {
               url: "/processPost",
               method: "post",
               isArray: false
            },
            listAllPosts: {
               url: "/listAllPosts",
               method: "get",
               isArray: true
            }
         });
         
         function sendPost(postInfo) {
            return apiRes.sendPost(postInfo).$promise;
         }
         
         function listAllPosts() {
            return apiRes.listAllPosts().$promise;
         }
         
         return svc;
      }
   ]);
})();

Again, there is nothing special needed to handle the data exchange. The back end service is a RESTful service, so ngResource works well. If I needed any additional header info, I could add it to the API call, but I don't.

Next, I will discuss the back end service design. This is the part that was somewhat unknown to me. In case you don't know, I started with C/C++ in the 90s. At that time, characters were represented as single bytes. Then there were the double-byte Unicode characters, and the multi-byte character sets known as code pages, which for Chinese are also mostly double bytes. Then there was the 4-byte form of Unicode. Then UTF-8 came along as a variable-length encoding in which a character can take 1, 2, 3, or 4 bytes. Java and C# both use 16-bit Unicode (UTF-16) characters for their strings. This application uses Java, so I was a bit unsure how UTF-8 characters would be stored. This is where I admit that I don't know encoding as well as I could. The question was whether I needed any encoding conversion between Unicode and UTF-8.

Back End RESTful Service API

Turns out, I don't need to do anything special either. When a request with a UTF-8 character string in it reaches the back end, the framework decodes the UTF-8 bytes and hands me an ordinary Java String object. Internally, a Java String holds Unicode text as UTF-16 code units rather than UTF-8 bytes, but the conversion in both directions is done for me, so I never have to convert anything by hand. This certainly is very convenient. Anyway, before I get into the heart of the back end service, I will start from the beginning. Almost all of this is a review of what I have done in the past, so there are no new surprises.
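
To make the byte-to-String step concrete, here is a small sketch of what a framework does on my behalf when it reads a UTF-8 request body. It is an illustration only, with made-up class and method names, and it is not code from the sample project. It decodes raw UTF-8 bytes into a Java String and then compares three different "lengths" of the same text:

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper class; the name is not from the sample project.
final class StringDecodingDemo {

    // Decodes raw UTF-8 bytes, e.g. an HTTP request body, into a Java String.
    static String decodeUtf8(byte[] body) {
        return new String(body, StandardCharsets.UTF_8);
    }

    // Counts Unicode code points, which can be fewer than String.length()
    // because length() counts UTF-16 code units.
    static int codePoints(String text) {
        return text.codePointCount(0, text.length());
    }

    public static void main(String[] args) {
        // UTF-8 bytes for U+4F60 and U+597D ("ni hao"), followed by U+1F600.
        byte[] body = {
            (byte) 0xE4, (byte) 0xBD, (byte) 0xA0,
            (byte) 0xE5, (byte) 0xA5, (byte) 0xBD,
            (byte) 0xF0, (byte) 0x9F, (byte) 0x98, (byte) 0x80
        };
        String text = decodeUtf8(body);

        System.out.println(body.length);      // prints 10, bytes on the wire
        System.out.println(text.length());    // prints 4, UTF-16 code units
        System.out.println(codePoints(text)); // prints 3, characters as a reader counts them
    }
}
```

The three numbers differ, yet the text is the same. This is why limits such as the maxlength on the form or the VARCHAR size in the table do not necessarily count the same thing.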

Let me begin with the entry point of the back end application. It looks like this:

Java
package org.hanbo.boot.rest;

import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;

@SpringBootApplication
public class App
{   
   public static void main(String[] args)
   {
      SpringApplication.run(App.class, args);
   }
}

This startup entry is something I have copied and pasted from one of my many tutorials of the past 3 years. I have explained how it works before and won't repeat myself here.

Next, I need a configuration class that can provide the MySQL database connections. This is again copied and pasted from one of my past tutorials. I use NamedParameterJdbcTemplate with plain SQL queries and updates for this project. Here is the full source code of this configuration class:

Java
package org.hanbo.boot.rest;

import javax.sql.DataSource;

import org.apache.commons.dbcp2.BasicDataSource;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.jdbc.core.namedparam.NamedParameterJdbcTemplate;
import org.springframework.jdbc.datasource.DataSourceTransactionManager;

@Configuration
public class DataAccessConfiguration
{
   @Value("${db.jdbc.driver}")
   private String dbJdbcDriver;
   
   @Value("${db.conn.string}")
   private String dbConnString;
   
   @Value("${db.access.username}")
   private String dbAccessUserName;

   @Value("${db.access.password}")
   private String dbAccessPassword;

   @Value("${db.access.validity.query}")
   private String dbAccessValityQuery;
   
   @Bean
   public DataSource dataSource()
   {
      BasicDataSource dataSource = new BasicDataSource();

      dataSource.setDriverClassName(dbJdbcDriver);
      dataSource.setUrl(dbConnString);
      dataSource.setUsername(dbAccessUserName);
      dataSource.setPassword(dbAccessPassword);
      dataSource.setMaxIdle(4);
      dataSource.setMaxTotal(20);
      dataSource.setInitialSize(4);
      dataSource.setMaxWaitMillis(900000);
      dataSource.setTestOnBorrow(true);
      dataSource.setValidationQuery(dbAccessValityQuery);
      
      return dataSource;
   }
   
   @Bean
   public NamedParameterJdbcTemplate namedParameterJdbcTemplate()
   {
      NamedParameterJdbcTemplate retVal
          = new NamedParameterJdbcTemplate(dataSource());
      return retVal;
   }

   @Bean
   public DataSourceTransactionManager txnManager()
   {
      DataSourceTransactionManager txnManager
         = new DataSourceTransactionManager(dataSource());
      return txnManager;
   }
}

This is a little more complicated than the entry class. I am going to repeat myself so you don't have to look up my previous tutorials. First, I need to load some data from the application configuration file called application.properties. This is how I do it:

Java
@Value("${db.jdbc.driver}")
private String dbJdbcDriver;

@Value("${db.conn.string}")
private String dbConnString;

@Value("${db.access.username}")
private String dbAccessUserName;

@Value("${db.access.password}")
private String dbAccessPassword;

@Value("${db.access.validity.query}")
private String dbAccessValityQuery;

These lines load the configuration properties by key. For example, this line:

Java
@Value("${db.jdbc.driver}")
private String dbJdbcDriver;

It will read this line from the application.properties file:

db.jdbc.driver=com.mysql.cj.jdbc.Driver
...
...
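
To illustrate the key-to-value lookup that @Value performs, here is a plain Java sketch that parses text in the same properties syntax. It is not how Spring does it internally, and the class name is made up. Only the db.jdbc.driver line comes from the file shown above; every other value is a placeholder I am assuming for illustration (the host, the port, and the characterEncoding option on the connection string are guesses, not the project's actual settings):

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical helper class; the name is not from the sample project.
final class PropertiesLookupDemo {

    // Parses text written in application.properties syntax into key/value pairs.
    static Properties parse(String propertiesText) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(propertiesText));
        } catch (IOException e) {
            // Reading from an in-memory string should not fail.
            throw new UncheckedIOException(e);
        }
        return props;
    }

    public static void main(String[] args) {
        // Only the first line is quoted from the tutorial.
        // The remaining values are assumed placeholders.
        String text = String.join("\n",
            "db.jdbc.driver=com.mysql.cj.jdbc.Driver",
            "db.conn.string=jdbc:mysql://localhost:3306/utf8testdb?characterEncoding=UTF-8",
            "db.access.username=utf8tdbuser",
            "db.access.validity.query=SELECT 1");

        Properties props = parse(text);
        System.out.println(props.getProperty("db.jdbc.driver"));
        System.out.println(props.getProperty("db.conn.string"));
    }
}
```

Each @Value("${...}") field in the configuration class receives the value stored under the matching key in much the same way.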

Next, I need to define a bean named dataSource. This bean specifies the MySQL connection info for the application.

Java
@Bean
public DataSource dataSource()
{
   BasicDataSource dataSource = new BasicDataSource();

   dataSource.setDriverClassName(dbJdbcDriver);
   dataSource.setUrl(dbConnString);
   dataSource.setUsername(dbAccessUserName);
   dataSource.setPassword(dbAccessPassword);
   dataSource.setMaxIdle(4);
   dataSource.setMaxTotal(20);
   dataSource.setInitialSize(4);
   dataSource.setMaxWaitMillis(900000);
   dataSource.setTestOnBorrow(true);
   dataSource.setValidationQuery(dbAccessValityQuery);
   
   return dataSource;
}

Next, I need a bean that gives me a NamedParameterJdbcTemplate object. It uses the dataSource bean defined previously:

Java
@Bean
public NamedParameterJdbcTemplate namedParameterJdbcTemplate()
{
   NamedParameterJdbcTemplate retVal
      = new NamedParameterJdbcTemplate(dataSource());
   return retVal;
}

Finally, I need a transaction manager. I need it so that I can put the @Transactional annotation on methods. This allows the changes to be committed or rolled back depending on the success of the DB operations within a single method. This is how I defined the transaction manager:

Java
@Bean
public DataSourceTransactionManager txnManager()
{
   DataSourceTransactionManager txnManager
      = new DataSourceTransactionManager(dataSource());
   return txnManager;
}

Once I have the entry point of the application and this DataAccessConfiguration class, I can create the API controller and the service class. The service class is the one that uses the NamedParameterJdbcTemplate to perform the CRUD operations. The API controller handles the HTTP requests from the user.

Here is the full source code of the RESTful API controller:

Java
package org.hanbo.boot.rest.controllers;

import java.util.ArrayList;
import java.util.List;

import org.hanbo.boot.rest.models.PostDataModel;
import org.hanbo.boot.rest.models.StatusModel;
import org.hanbo.boot.rest.services.PostDataService;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.RequestBody;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RequestMethod;
import org.springframework.web.bind.annotation.RestController;

@RestController
public class SampleFormController
{
   @Autowired
   private PostDataService postDataSvc;
   
   @RequestMapping(value="/processPost",
                   method=RequestMethod.POST,
                   consumes = "application/json",
                   produces = "application/json")
   public ResponseEntity<StatusModel> 
   processPost(@RequestBody PostDataModel postContent)
   {
      StatusModel resp = new StatusModel();
      if (postContent != null)
      {
         System.out.println
         (String.format("Title: [%s]", postContent.getTitle()));
         System.out.println
         (String.format("Content: [%s]", postContent.getContent()));
         
         String postId = postDataSvc.savePostData(postContent);
         if (postId != null && !postId.isEmpty())
         {
            resp.setStatus(true);
            resp.setMessage("Post has been processed successfully. postID: " + postId);
         }
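         else
         {
            // Saving failed, so report a false status to the caller.
            resp.setStatus(false);
            resp.setMessage("Unable to save post entity. Unknown error.");
         }
      }
      else
      {
         // No request body was available, so report a false status.
         resp.setStatus(false);
         resp.setMessage("Unable to save post entity. No valid post object available.");
      }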
      
      ResponseEntity<StatusModel> retVal = ResponseEntity.ok(resp);
      
      return retVal;
   }
   
   @RequestMapping(value="/listAllPosts",
                   method=RequestMethod.GET,
                   produces = "application/json")
   public ResponseEntity<List<PostDataModel>> getAllPosts()
   {
      List<PostDataModel> retList = postDataSvc.getAllPostData();
      if (retList == null)
      {
         retList = new ArrayList<PostDataModel>();
      }
      
      ResponseEntity<List<PostDataModel>> 
      retVal = ResponseEntity.ok(retList);
      
      return retVal;
   }
}

The method that saves the object to the database is called processPost. And the URL that maps to this method is "/processPost". In this method, when I get the request object, I will use System.out.println() to print the subject and the content properties of the request to the console. Here is the screenshot when the data is output to the console:

Image 3
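
Whether the console shows readable text or question marks depends on the encoding that the JVM and the terminal use for standard output, not on the application. As a hedged sketch (not part of the sample project, with made-up class and method names), one way to take the platform default out of the picture is to wrap the output stream in a PrintStream that always writes UTF-8:

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;

// Hypothetical helper class; the name is not from the sample project.
final class Utf8ConsoleDemo {

    // Wraps a stream so that text is always written as UTF-8,
    // regardless of the platform default encoding.
    static PrintStream utf8Stream(OutputStream target) {
        try {
            return new PrintStream(target, true, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // Every Java runtime is required to support UTF-8.
            throw new IllegalStateException(e);
        }
    }

    // Captures the raw bytes that print() emits for the given text.
    static byte[] printedBytes(String text) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        PrintStream out = utf8Stream(buffer);
        out.print(text);
        out.flush();
        return buffer.toByteArray();
    }

    public static void main(String[] args) {
        PrintStream console = utf8Stream(System.out);
        // Same format as the controller's log line, with a Korean title.
        console.println(String.format("Title: [%s]", "\uD55C\uAD6D\uC5B4"));

        // One Korean syllable is written as 3 bytes.
        System.out.println(printedBytes("\uD55C").length); // prints 3
    }
}
```

The terminal still has to be set to display UTF-8 for the characters to look right; the wrapper only guarantees which bytes are written.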

The controller passes the heavy lifting to the service object called postDataSvc. It is of type PostDataService. Here is the full source code of the implementation of this PostDataService:

Java
package org.hanbo.boot.rest.services;

import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.UUID;

import org.hanbo.boot.rest.models.PostDataModel;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.jdbc.core.namedparam.MapSqlParameterSource;
import org.springframework.jdbc.core.namedparam.NamedParameterJdbcTemplate;
import org.springframework.stereotype.Service;
import org.springframework.transaction.annotation.Transactional;

@Service
public class PostDataServiceImpl
   implements PostDataService
{
   // A Java string literal cannot span lines, so the statement is concatenated.
   private final String sql_insertPostData =
      "INSERT INTO `testcontent` (id, subject, content) "
      + "VALUES (:id, :title, :content);";

   private final String sql_queryAllPosts = 
   "SELECT id, subject, content FROM `testcontent` LIMIT 1000;";
   
   @Autowired
   private NamedParameterJdbcTemplate sqlDao;
   
   @Override
   @Transactional
   public String savePostData(PostDataModel dataToSave)
   {
      if (dataToSave != null)
      {
         String title = dataToSave.getTitle();
         if (title == null || title.isEmpty())
         {
            throw new RuntimeException("Title is NULL or empty");
         }
         
         String content = dataToSave.getContent();
         if (content == null || content.isEmpty())
         {
            throw new RuntimeException("Content is NULL or empty");
         }
         
         Map<String, Object> parameters = new HashMap<String, Object>();
         
         String postId = generateId();
         parameters.put("id", postId);
         parameters.put("title", dataToSave.getTitle());
         parameters.put("content", dataToSave.getContent());
         
         int updateCount = sqlDao.update(sql_insertPostData, parameters);
         if (updateCount > 0)
         {
            return postId;
         }         
      }
      
      return "";
   }

   @Override
   public List<PostDataModel> getAllPostData()
   {
      List<PostDataModel> retVal = new ArrayList<PostDataModel>();
      
      retVal = sqlDao.query(sql_queryAllPosts,
            (MapSqlParameterSource)null,
            (rs) -> {
               List<PostDataModel> foundObjs = new ArrayList<PostDataModel>();
               if (rs != null)
               {
                  while (rs.next())
                  {
                     PostDataModel postToAdd = new PostDataModel();
                     postToAdd.setPostId(rs.getString("id"));
                     postToAdd.setTitle(rs.getString("subject"));
                     postToAdd.setContent(rs.getString("content"));
                     
                     foundObjs.add(postToAdd);
                  }
               }
               
               return foundObjs;
            });

      return retVal;
   }

   private static String generateId()
   {
      UUID uuid = UUID.randomUUID();
      String retVal = uuid.toString().replaceAll("-", "");
      
      return retVal;
   }
}

I split the service into an interface and an implementation class. The code above is the implementation class. In it, I used the NamedParameterJdbcTemplate for handling the data access code. If you review the code above, there is again nothing special. If you are interested, here is the code for inserting the record into the database:

Java
Map<String, Object> parameters = new HashMap<String, Object>();
         
String postId = generateId();
parameters.put("id", postId);
parameters.put("title", dataToSave.getTitle());
parameters.put("content", dataToSave.getContent());

int updateCount = sqlDao.update(sql_insertPostData, parameters);
if (updateCount > 0)
{
   return postId;
}

Here is the code to query all the saved rows from the table:

Java
retVal = sqlDao.query(sql_queryAllPosts,
      (MapSqlParameterSource)null,
      (rs) -> {
         List<PostDataModel> foundObjs = new ArrayList<PostDataModel>();
         if (rs != null)
         {
            while (rs.next())
            {
               PostDataModel postToAdd = new PostDataModel();
               postToAdd.setPostId(rs.getString("id"));
               postToAdd.setTitle(rs.getString("subject"));
               postToAdd.setContent(rs.getString("content"));
               
               foundObjs.add(postToAdd);
            }
         }
         
         return foundObjs;
      });

The trick here is that when the request data comes in, it is saved to the database with the correct character set encoding. Because the database is configured correctly for the data, the rows come back exactly as they were stored, and the response can return the same text that arrived in the original requests.
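
The same idea can be shown without a database. The sketch below is an illustration only, with made-up names, and is not part of the sample project. It encodes a string to bytes and decodes it again, first with matching charsets and then with a mismatch, which is what happens when two layers of an application disagree about the encoding:

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class; the name is not from the sample project.
final class RoundTripDemo {

    // Encodes with one charset and decodes with another, the way text
    // travels when it is written by one layer and read by a different one.
    static String roundTrip(String text, Charset encodeWith, Charset decodeWith) {
        return new String(text.getBytes(encodeWith), decodeWith);
    }

    public static void main(String[] args) {
        // U+65E5 U+672C U+8A9E, the word for the Japanese language.
        String original = "\u65E5\u672C\u8A9E";

        String same = roundTrip(original,
            StandardCharsets.UTF_8, StandardCharsets.UTF_8);
        String garbled = roundTrip(original,
            StandardCharsets.UTF_8, StandardCharsets.ISO_8859_1);

        System.out.println(original.equals(same));    // prints true
        System.out.println(original.equals(garbled)); // prints false
        System.out.println(garbled.length());         // prints 9, one char per UTF-8 byte
    }
}
```

When every layer (page, request, Java, driver, database) agrees on UTF-8, the text survives the trip. A single mismatch produces the garbled characters that anyone who worked with code pages will recognize.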

The POM File

The last thing I would like to mention is the Maven POM file for this project. There is just one property in this file I want to point out: it sets the source file encoding to UTF-8. This is useful if the source files or the static files use the UTF-8 charset. It also removes the build warning: "[WARNING] Using platform encoding (CP1251 actually) to copy filtered resources, i.e. build is platform dependent!"

Here is how I have specified such configuration:

XML
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 
    http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>

    ...
    
    <properties>
      ...
      <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
    </properties>
    
    ...
</project>

That is all there is to this sample application. Now that its ins and outs have been discussed, it is time to test it out.

How to Run the Sample Application

After you download the source code as a zip file, you should first rename all the *.sj files to *.js files.

Then, in a terminal (or command prompt), go to the base folder of the unzipped source code and type in the following command:

mvn clean install

Once the project has built successfully, in the same terminal/command prompt, run the following command:

java -jar target/hanbo-agular-utf8webapp-1.0.1.jar

Once the command runs successfully, use a browser and navigate to:

http://localhost:8080/

When the web page shows up, you can enter something in the subject line and the content text area, something in English, and in Chinese, Japanese, or Korean. You can also try some other languages like Hebrew, Arabic, or Hindi. Then click the green button called "Save". The saved request data will show up at the bottom:

Image 4

As shown I have tried Chinese, Japanese, and Korean. All are retrieved and displayed correctly after saving to database. When I run the query in MySQL Workbench, I can see these in the database table:

Image 5

Both (the page display and the database query) verify that the end-to-end application execution is correct. However, as I said in the disclaimer, I am no expert in using the UTF-8 charset in web applications, so there might be something I didn't do, and the application could still blow up. I hope this won't happen. Anyway, please take the sample project as it is.

Summary

Writing a web based application using UTF-8 as the base charset is something I always wanted to do, but it took many years before I took this first step. I created this tutorial so that I can refer back to it in case I have to do something similar in the future. I thought it might be hard. It turned out not to be hard at all. I guess at my level of experience and skills, nothing in web application design is hard any more.

The biggest challenge is the database configuration. The trick for MySQL is to use the utf8mb4 charset and not the one MySQL calls utf8. Next, I have to make sure that the web page uses UTF-8 as its character encoding and that the form uses UTF-8 as its charset. There is no change on the Java side, because a Java String can hold any Unicode text and the framework and the JDBC driver handle the conversion to and from UTF-8, so no manual string conversion is needed. As long as the string data is properly encoded (in UTF-8) at the boundaries, the end-to-end test run with the sample application works correctly.

Creating this sample application has been really fun. And surely, I learned something valuable. I hope it is useful to you readers. If you saw anything weird, please write it in the comments section of this tutorial. It will be helpful for others and for me as well. Thank you for reading this.

History

  • 12th April, 2021 - Initial draft

License

This article, along with any associated source code and files, is licensed under The MIT License