All categories
-
Term frequency–inverse document frequency (TF-IDF) | MLAI/Preprocessing | 2019. 9. 25. 01:34
1. Overview A document-term or term-document matrix records how often each term occurs in each document of a collection. In the document-term matrix, rows represent documents and columns represent terms; the term-document matrix is its transpose. 1.1 Motivation We have a large number of documents: books, academic articles, legal documents, websites, etc. A user t..
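The matrix described in the overview can be sketched in a few lines. This is a minimal illustration, not the post's own implementation: the two-sentence corpus, the `tokenize` helper, and its naive split on non-letters are all assumptions made for the example.

```javascript
// Hypothetical toy corpus (not from the original post).
const docs = [
  "the cat sat on the mat",
  "the dog sat",
];

// Naive tokenizer (assumption): lowercase, split on anything that is not a-z.
function tokenize(text) {
  return text.toLowerCase().split(/[^a-z]+/).filter(Boolean);
}

// Vocabulary: sorted unique terms across the whole collection.
const vocabulary = [...new Set(docs.flatMap(tokenize))].sort();

// Document-term matrix: one row per document, one column per term.
const documentTerm = docs.map((doc) => {
  const tokens = tokenize(doc);
  return vocabulary.map((term) => tokens.filter((t) => t === term).length);
});

// Term-document matrix: the transpose, one row per term.
const termDocument = vocabulary.map((_, j) => documentTerm.map((row) => row[j]));

console.log(vocabulary);   // [ 'cat', 'dog', 'mat', 'on', 'sat', 'the' ]
console.log(documentTerm); // [ [ 1, 0, 1, 1, 1, 2 ], [ 0, 1, 0, 0, 1, 1 ] ]
console.log(termDocument);
```

Each cell holds a raw term count; TF-IDF weighting would rescale these counts, which this sketch does not attempt.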
-
Array Operations | DynamicPL/Javascript | 2019. 9. 22. 09:05
1. Overview A summary of array operations in JavaScript, such as pop, push, shift, unshift, splice, slice, and split. 2. Description var a = [1, 2, 3]; var b = a.unshift(0); console.log(a); //[0, 1, 2, 3] console.log(b); //4 var a = [1, 2, 3]; var b = a.shift(); console.log(a); //[2, 3] console.log(b); //1 var b = a.shift(2); console.log(a); //[3] console.log(b); //2, only one element is shifted var..
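The excerpt is cut off before the remaining operations named in the overview. As a hedged companion sketch (the variable names are illustrative, not the post's), the standard behavior of push, pop, slice, splice, and split looks like this; note that split is a String method that returns an array.

```javascript
var nums = [1, 2, 3];

var pushed = nums.push(4);       // appends at the end, returns the new length
console.log(nums);               // [1, 2, 3, 4]
console.log(pushed);             // 4

var popped = nums.pop();         // removes the last element and returns it
console.log(nums);               // [1, 2, 3]
console.log(popped);             // 4

var copy = nums.slice(1, 3);     // copies index 1 up to (not including) 3
console.log(nums);               // [1, 2, 3], slice does not mutate
console.log(copy);               // [2, 3]

var removed = nums.splice(1, 1); // removes 1 element at index 1, in place
console.log(nums);               // [1, 3]
console.log(removed);            // [2]

var parts = "a,b,c".split(",");  // String method: splits into an array
console.log(parts);              // ['a', 'b', 'c']
```

The main distinction to keep in mind is that slice returns a copy and leaves the array untouched, while splice mutates the array and returns the removed elements.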
-
MongoDB | DB/Nosql | 2019. 9. 20. 10:11
1. Overview MongoDB is a cross-platform, document-oriented database program. Classified as a NoSQL database program, MongoDB uses JSON-like documents with optional schemas. MongoDB is developed by MongoDB Inc. and licensed under the Server Side Public License (SSPL). 2. Description 2.1 Ad hoc queries MongoDB supports field queries, range queries, and regular-expression searches. Queries can return specific fields of ..
-
Spring Security | Framework/SPRING | 2019. 9. 20. 08:05
1. Overview Spring Security is a separate module of the Spring framework that focuses on providing authentication and authorization methods in Java applications. It also takes care of most of the common security vulnerabilities such as CSRF attacks. To use Spring Security in web applications, you can get started with a simple annotation: @EnableWebSecurity. Spring Security is a powerful and high..
-
Apache Spark | DistributedSystem/Spark | 2019. 9. 20. 00:55
1. Overview An open-source, distributed, general-purpose cluster computing framework with a mostly in-memory data processing engine. It can do ETL, analytics, machine learning, and graph processing on large volumes of data at rest (batch processing) or in motion (stream processing), with rich, concise, high-level APIs for Scala, Python, Java, R, and SQL. 2. Description 2.1 A..
-
Kinesis | Data Engineering | 2019. 9. 20. 00:42
Kinesis Data Streams: real-time data streaming. Retention from 1 day to 365 days. Ability to reprocess (replay) data. Once data is inserted into Kinesis, it can't be deleted (immutability). Records that share the same partition key go to the same shard (ordering). Producers: AWS SDK, Kinesis Producer Library (KPL), Kinesis Agent. Consumers. Write your own: Kinesis Client Library (KCL), AWS SDK. Managed: AWS Lamb..
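The ordering guarantee follows from how records are routed: Kinesis hashes the partition key with MD5 into a 128-bit integer and sends the record to the shard whose hash key range contains that value. The sketch below illustrates the idea only. The `hashKey` and `shardFor` helpers are hypothetical, and the even split of the key space is an assumption; real shards can own uneven ranges after splits and merges.

```javascript
const crypto = require("crypto");

// Hash a partition key to a 128-bit integer (MD5 digest read as one big number).
function hashKey(partitionKey) {
  const hex = crypto.createHash("md5").update(partitionKey, "utf8").digest("hex");
  return BigInt("0x" + hex);
}

// Assumption: shardCount shards that divide the 128-bit key space evenly.
function shardFor(partitionKey, shardCount) {
  const rangeSize = (1n << 128n) / BigInt(shardCount);
  const index = hashKey(partitionKey) / rangeSize;
  // Clamp to the last shard when the key space does not divide evenly.
  const last = BigInt(shardCount - 1);
  return Number(index > last ? last : index);
}

// The same key always lands on the same shard, which is what preserves ordering.
console.log(shardFor("user-42", 4) === shardFor("user-42", 4)); // true
```

Because routing depends only on the key, choosing a key with few distinct values concentrates traffic on few shards.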
-
Transaction Management | Framework/SPRING | 2019. 9. 17. 20:36
1. Overview A database transaction is a sequence of actions that are treated as a single unit of work. These actions should either complete entirely or take no effect at all. Transaction management is an important part of RDBMS-oriented enterprise applications, ensuring data integrity and consistency. 2. Description 2.1 Core Concepts The following four key properties are the core concepts of a tra..
-
Hadoop Yet Another Resource Negotiator (Yarn) | DistributedSystem/HadoopEcyosystem | 2019. 9. 14. 16:28
1. Overview A platform responsible for managing computing resources in clusters and using them to schedule users' applications. Yarn allows different data processing engines, such as graph processing, interactive processing, stream processing, and batch processing, to run and process data stored in HDFS (Hadoop Distributed File System). Apart from resource management, Yarn also does J..