Data Warehouse, Big Data, and Green Computing

Introduction


Technology evolves continuously, and a technology launched today may become obsolete within a short period as better alternatives replace it. Recent developments that are making a significant impact include data warehousing, Big Data, and Green Computing. This paper addresses the major components of a data warehouse architecture, the forms of data transformation needed to prepare data for a data warehouse, trends in data warehousing, and Big Data and how it is used personally and professionally. It concludes with ways in which companies can make their data centers “green” in line with Green Computing.

Data Warehousing Architecture


A data warehouse is an information system designed to store cumulative, historical data from a single source or from multiple sources (Arora & Gupta, 2017). It can play a significant role in helping organizations simplify analysis and reporting. An organization that uses a data warehouse keeps a single version of its facts and figures, which improves accuracy during forecasting and decision-making.

Components of Data Warehousing Architecture


Data Warehouse Database


The data warehouse database forms the base of the data warehousing environment. It is most often implemented using relational database management system (RDBMS) technology.

Extract, Transform and Load (ETL) Tools


The next component is the set of Extract, Transform and Load (ETL) tools, which perform several functions. According to Arora and Gupta (2017), ETL tools perform the conversions, summarizations, regular updates, and other changes needed to convert data from its raw form into the unified format required by the data warehouse. ETL tools are also responsible for eliminating unwanted data, anonymizing data, populating defaults where data is missing, calculating summaries, and searching for names that differ across sources and replacing them with common ones.
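The cleaning steps described above can be sketched in a few lines of Python. This is a minimal illustration rather than the behavior of any particular ETL product: the records, the field names (`customer`, `region`, `amount`, `debug_flag`), the alias table, and the default value are all assumptions made for the example, and the truncated hash is only a simple stand-in for a real anonymization scheme.

```python
import hashlib
from collections import defaultdict

# Hypothetical raw records from two source systems (illustrative data only).
RAW_ROWS = [
    {"customer": "Alice Smith", "region": "N. America", "amount": 120.0, "debug_flag": "x"},
    {"customer": "Bob Jones", "region": "North America", "amount": None, "debug_flag": "y"},
    {"customer": "Alice Smith", "region": "EU", "amount": 80.0, "debug_flag": "z"},
]

# Assumed mapping from source-specific names to one common name.
REGION_ALIASES = {"N. America": "North America", "EU": "Europe"}
DEFAULT_AMOUNT = 0.0


def anonymize(value: str) -> str:
    """Replace an identifying value with a short one-way hash."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()[:12]


def transform(row: dict) -> dict:
    """Apply the cleaning steps to one raw record."""
    return {
        # Anonymize the customer name; the unwanted debug_flag field is dropped.
        "customer_id": anonymize(row["customer"]),
        # Replace source-specific region names with the common one.
        "region": REGION_ALIASES.get(row["region"], row["region"]),
        # Populate a default when the amount is missing.
        "amount": DEFAULT_AMOUNT if row["amount"] is None else row["amount"],
    }


def summarize(rows: list[dict]) -> dict:
    """Calculate the total amount per region."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["region"]] += row["amount"]
    return dict(totals)


clean_rows = [transform(row) for row in RAW_ROWS]
region_totals = summarize(clean_rows)
print(region_totals)  # {'North America': 120.0, 'Europe': 80.0}
```

Each function corresponds to one of the responsibilities listed in the text, which keeps the steps easy to test and reorder independently.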

Metadata


Metadata is data about other data. Metadata is closely linked to the data warehouse and is used to build, maintain, and manage it (Arora & Gupta, 2017). In the data warehouse architecture, metadata specifies the source, values, usage, and features of warehouse data and defines how the data can be modified and processed. It can be classified into technical and business metadata. Technical metadata contains the warehouse information used by data warehouse administrators and designers, while business metadata stores the details that allow end users to easily understand the information stored in the warehouse (Arora & Gupta, 2017).

Query Tools


Query tools are the components that allow users to interact with the data warehouse system. They are generally classified into four main groups: query and reporting tools, data mining tools, application development tools, and online analytical processing (OLAP) tools.

Data Marts


Data marts are access layers used to deliver data to users. A data mart takes less money and time to build than a full data warehouse, which makes it something like a subsidiary of the warehouse. It is generally created for a specific group of users.
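One simple way to picture a data mart is as a filtered view over a warehouse table that exposes only what one group of users needs. The sketch below uses Python's built-in `sqlite3` module. The table `warehouse_sales`, the view `marketing_mart`, and the sample rows are hypothetical, and in practice a data mart is often a separate physical store rather than a view.

```python
import sqlite3

# In-memory database standing in for the warehouse (illustrative only).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE warehouse_sales (department TEXT, product TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO warehouse_sales VALUES (?, ?, ?)",
    [
        ("marketing", "ads", 500.0),
        ("finance", "audit", 300.0),
        ("marketing", "events", 200.0),
    ],
)

# The data mart: a view that exposes only the marketing group's rows.
conn.execute(
    "CREATE VIEW marketing_mart AS "
    "SELECT product, amount FROM warehouse_sales WHERE department = 'marketing'"
)

mart_rows = conn.execute(
    "SELECT product, amount FROM marketing_mart ORDER BY product"
).fetchall()
print(mart_rows)  # [('ads', 500.0), ('events', 200.0)]
```

Users of the marketing mart query a small, focused structure and never see the finance rows held in the wider warehouse.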

Forms of Data Transformation


Data transformation occurs in various ways. The first is scripting: the traditional batch approach, in which scripts written in Python or Structured Query Language (SQL) extract and transform the data (Feng, Hannig, & Marron, 2016). The second involves cloud-based ETL tools. Firms that use them leverage the expertise of a vendor without incurring the cost of purchasing infrastructure.
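A scripted batch transformation of the kind described above might look like the following sketch, which combines Python and SQL through the standard `sqlite3` module. The tables `raw_orders` and `daily_sales`, their columns, and the sample orders are assumptions made for illustration. A production script would read from real source systems and run on a schedule.

```python
import sqlite3


def run_batch(conn: sqlite3.Connection) -> list:
    """Extract raw orders, transform them with SQL, and load a daily summary."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS daily_sales "
        "(order_date TEXT PRIMARY KEY, total REAL, orders INTEGER)"
    )
    # Rebuild the summary on every run so the batch can be repeated safely.
    conn.execute("DELETE FROM daily_sales")
    conn.execute(
        """
        INSERT INTO daily_sales (order_date, total, orders)
        SELECT substr(ordered_at, 1, 10),              -- timestamp -> date
               ROUND(SUM(amount_cents) / 100.0, 2),    -- cents -> currency units
               COUNT(*)
        FROM raw_orders
        GROUP BY substr(ordered_at, 1, 10)
        """
    )
    conn.commit()
    return conn.execute(
        "SELECT order_date, total, orders FROM daily_sales ORDER BY order_date"
    ).fetchall()


# Hypothetical source data loaded into an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE raw_orders (ordered_at TEXT, amount_cents INTEGER)")
conn.executemany(
    "INSERT INTO raw_orders VALUES (?, ?)",
    [
        ("2020-02-21T09:15:00", 1250),
        ("2020-02-21T17:40:00", 750),
        ("2020-02-22T11:05:00", 3000),
    ],
)

daily = run_batch(conn)
print(daily)  # [('2020-02-21', 20.0, 2), ('2020-02-22', 30.0, 1)]
```

Python handles orchestration here while SQL expresses the transformation itself, a division of labor that is common in hand-written batch jobs.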

Trends in Data Warehousing


One current key trend in data warehousing is “datafication,” driven by the streams of data generated by social media traffic, mobile devices, networked sensors, and other sources (Chandra & Gupta, 2018). Firms are leveraging the Internet of Things (IoT) by building more capable data warehouses. Data warehouses are responding by adding capabilities to handle new types of data, manipulate data faster, and handle large volumes of data at once.
Another trend is the consolidation of physical and logical infrastructure to reduce costs. Faced with the flood of data from these sources, firms are implementing virtualization, compression, and multi-tenant databases so that they can handle more data and new data types faster than before while cutting costs (Chandra & Gupta, 2018). This consolidation is thus a development that cannot be ignored.

Big Data


Big Data refers to the large volume of both structured and unstructured data that a business can collect and use purposefully on a day-to-day basis. The amount of data available plays a smaller role in defining the term. What the business does with the structured and unstructured data defines the concept of Big Data to a larger extent. For instance, businesses can analyze this data before making strategic moves and to make better decisions in general.
The ability to store large volumes of data over time has existed for a while. However, the inception of the term Big Data meant that large volumes of stored data had to be retrieved faster and that many more data types had to be stored too. These characteristics came to be known as the three Vs: the increase in the volume, velocity, and variety of data. For instance, more varieties of data had to be collected from industrial equipment, videos, business transactions, smart devices, and social media platforms. As volumes grew, the speed at which data is stored and processed had to increase so that the data could be accessed in real time. Data also evolved to include unstructured text, video, email, stock ticker data, audio, and financial transactions in addition to the structured numerical data stored previously.

Uses of Big Data


Big Data has numerous uses in both professional and personal contexts. For instance, Big Data is currently being used professionally to fight crime (Stergiou & Psannis, 2017). Law enforcement agencies can analyze the vast amount of data held in police databases to identify patterns and trends in criminal activity. This approach, known as “predictive policing,” can help police officers estimate where, when, and how the next crime is likely to be committed. Big Data can thus help law enforcement agencies determine where to station police officers to prevent crimes before they occur, saving money and avoiding wasted effort.
On a personal level, Big Data can help people find what they want in their daily lives. For instance, anyone can now enjoy a more personalized music streaming experience, such as creating their own playlists or radio stations to suit their taste in music. Stergiou and Psannis (2017) affirm that, through the use of Big Data, music streaming services can collect information about the kind of music a person listens to and then help that person create playlists and radio stations from the resulting libraries. For example, Spotify uses Big Data to help listeners create a weekly playlist based on their taste.

The Demand Placed on Organizations and Data Management Technology by Big Data


The information infrastructure currently in use may not be adequate to accommodate Big Data sources at enterprise scale. Data management technology should therefore be revised to address these shortcomings in good time. Changing the infrastructure might be costly, and the organization might be required to allocate more funds for such changes (Sivarajah, Kamal, Irani, & Weerakkody, 2017). In other words, Big Data may require changes to the data management infrastructure, which in turn may require the organization to provide the finances and technical expertise to meet those needs.
There is also a possibility of an increased risk profile as new data sources are connected to data management systems. Improper configuration of data management technologies, together with the risks posed by data from multiple Big Data sources, some of which may be unreliable, can compromise an organization's entire operations (Sivarajah et al., 2017). The data management technology will thus need to be secured against threats, while the organization will have to bear the costs of the proposed changes.

Green Computing


Green computing refers to using computers and related computing resources in an environmentally responsible, eco-friendly way. The process involves designing, manufacturing, using, and properly disposing of computers and computing resources in a manner that reduces negative impacts on the environment (Farhan et al., 2018). Green computing thus enables companies to be eco-friendly and environmentally responsible by using premium energy-efficient resources. For example, the Energy Star program established by the Environmental Protection Agency promotes products that consume considerably less power than regular models. This helps organizations provide “green” products to clients through their data centers.
One organization that has successfully implemented green computing is White Label IT Solutions, which has deployed several green computing strategies. For instance, the company uses premium energy-efficient ENERGY STAR servers from vendors such as HP and Dell, which, according to the company, consume considerably less power than regular models. This has helped the firm join the list of data centers that offer their customers “greener” product options (White Label IT Solutions, n.d.).

Ways Organizations Can Make Data Centers “Green”


The growing number of computing resources is placing an increasing strain on energy resources. However, there are numerous ways in which organizations can make their data centers “green”. Initiatives have been enacted over the past three decades to keep data centers “green”. For example, in the United States, the Environmental Protection Agency formed the Energy Star program in 1992 as a voluntary labeling program to promote efficient energy use in all types of hardware components (Environmental Protection Agency, 2020). Computing resources such as central processing units, motherboards, servers, and peripheral devices were manufactured to be energy-efficient, reducing the amount of energy they consume. This was a move directed towards promoting green computing. The same concept is currently being used by companies globally that are investing in eco-friendly computers, computing resources, and technological devices. Companies that adopt such energy-efficient devices and components use energy more effectively than companies that do not (Saha, 2018). This can reduce the cooling loads of their data centers, helping these companies to be “green”.
Companies are also training employees in practices that lead to “greener” data centers. Practices that employees are taught in order to conserve energy include powering down central processing units and any connected peripherals, such as laser printers, when they are not in use (Saha, 2018). This plays a vital role in helping companies achieve an eco-friendly environment. Furthermore, companies encourage employees to switch off energy-intensive peripherals and hardware, purchase energy-efficient devices instead of desktop computers, and activate power management features on those devices to control energy consumption. These measures help data centers become “greener” by conserving energy.

References

 

  1. Arora, R. K., & Gupta, M. K. (2017). e-Governance using data warehousing and data mining. International Journal of Computer Applications, 169(8).
  2. Chandra, P., & Gupta, M. K. (2018). A comprehensive survey of data warehousing research. International Journal of Information Technology, 10(2), 217-224.
  3. Environmental Protection Agency. (2020). U.S. Environmental Protection Agency. Retrieved from https://www.epa.gov/ on February 22, 2020.
  4. Farhan, L., Kharel, R., Kaiwartya, O., Hammoudeh, M., & Adebisi, B. (2018). Towards green computing for Internet of things: Energy oriented path and message scheduling approach. Sustainable Cities and Society, 38, 195-204.
  5. Feng, Q., Hannig, J., & Marron, J. S. (2016). A note on automatic data transformation. Stat, 5(1), 82-87.
  6. Saha, B. (2018). Green computing: Current research trends. International Journal of Computer Sciences and Engineering, 6(3), 467-469.
  7. Sivarajah, U., Kamal, M. M., Irani, Z., & Weerakkody, V. (2017). Critical analysis of Big Data challenges and analytical methods. Journal of Business Research, 70, 263-286.
  8. Stergiou, C., & Psannis, K. E. (2017). Recent advances delivered by Mobile Cloud Computing and Internet of Things for Big Data applications: a survey. International Journal of Network Management, 27(3), e1930.
  9. White Label IT Solutions. (n.d.). What is the meaning of green computing? Retrieved from https://whitelabelitsolutions.com/meaning-green-computing/ on February 22, 2020.