H13-711_V3.0 HCIA-Big Data V3.0 Questions and Answers
Which of the following descriptions of basic Hive table creation operations is correct?
In a FusionInsight Hadoop cluster with 70 nodes, if the recommended deployment scheme is adopted, which partitions may exist on the management node?
Which of the following functions can the Kafka Cluster Mirroring tool implement?
Which statement about the DataNode of HDFS in the Huawei FusionInsight HD system is correct?
When planning a FusionInsight HD cluster for functional testing only, where the customer has no performance requirements and wants to save costs, the management node, control node, and data node can be deployed together. How many nodes are required at least?
"SELECT a.salary, b.address FROM employee a JOIN (SELECT address FROM employee_info WHERE province = 'zhejiang') b ON a.name = b.name" What types of operations are included?
"SELECT aa.salary, bb.address FROM employee aa JOIN (SELECT address FROM employee_info WHERE province = 'zhejiang') bb ON aa.name = bb.name" What types of operations does this statement contain?
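As an illustration only, a query of this shape can be run from Python against a small in-memory SQLite database (SQLite stands in for Hive here). The table contents are invented, and the subquery is assumed to also select `name`, because the statement as given selects only the address column and the join condition needs a name column to match on. The sketch shows one statement combining a subquery, a WHERE filter, and a JOIN.

```python
import sqlite3

# In-memory database with hypothetical sample rows (not from the question).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (name TEXT, salary INTEGER);
    CREATE TABLE employee_info (name TEXT, address TEXT, province TEXT);
    INSERT INTO employee VALUES ('li', 100), ('wang', 200);
    INSERT INTO employee_info VALUES
        ('li', 'hangzhou', 'zhejiang'),
        ('wang', 'nanjing', 'jiangsu');
""")

# A filtered subquery joined to the outer table on name.
# Assumption: the subquery also returns name so the ON clause can resolve.
query = """
    SELECT aa.salary, bb.address
    FROM employee aa
    JOIN (SELECT name, address
          FROM employee_info
          WHERE province = 'zhejiang') bb
      ON aa.name = bb.name
"""
rows = conn.execute(query).fetchall()
print(rows)
conn.close()
```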
In big data computing tasks, which of the following descriptions of I/O-intensive tasks is incorrect?
In a FusionInsight HD V100R002C60 cluster, which of the following components require partitions for metadata?
In the Output stage, Structured Streaming can define different data writing methods, including which of the following methods?
A. Append Mode
B. Update Mode
C. General Mode
D. Complete Mode
In the FusionInsight HD product, which statement about the Kafka component is correct?
In the era of big data, which of the following challenges are faced by enterprises?
What configuration files can the FusionInsight HD LLD configuration planning tool generate?
Which of the following descriptions about the characteristics of Kafka Partition replicas is correct?
What are the key features of Streaming in Huawei's big data product FusionInsight HD?
Which of the following operations can be performed in a Bolt, the processing node of Streaming?
If some information loss is acceptable during message processing, which of the following methods can be used to disable the message reliability mechanism?
In an HDFS federation environment, which of the following are included in the NameSpace?
Which component controls active/standby arbitration of the NameNode in HDFS?
What can the unified certification management system of mainstream manufacturers consist of?
Flink not only provides real-time computing that supports both high throughput and exactly-once semantics, but also provides batch data processing.
For performance optimization, it is recommended to deploy LdapServer and KrbServer on the same node in all clusters.
Multiple channels can be configured in the properties.properties configuration file of Flume to transmit data.
Which of the following operations cannot be recorded in the FusionInsight HD system audit log?
When the capacity scheduler allocates resources, there are two queues, Q1 and Q2, at the same level, each with a capacity of 30. Q1 has used 8 and Q2 has used 14, so resources will be allocated to Q1 first.
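The statement can be checked with a little arithmetic. The sketch below assumes the scheduler serves sibling queues in ascending order of used-to-configured-capacity ratio. The dictionary layout and the function name `allocation_order` are illustrative only and are not a YARN API.

```python
def allocation_order(queues):
    """Return queue names sorted by used/capacity ratio, lowest first."""
    return sorted(
        queues,
        key=lambda name: queues[name]["used"] / queues[name]["capacity"],
    )

# Values taken from the statement: both capacities are 30.
queues = {
    "Q1": {"capacity": 30, "used": 8},
    "Q2": {"capacity": 30, "used": 14},
}
order = allocation_order(queues)
print(order)
```

Under that assumption Q1 (8/30) sorts ahead of Q2 (14/30).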
FusionInsight Spark SQL, like the community Spark JDBCServer, only supports binding a single tenant to one YARN resource queue and does not support multi-tenant parallel execution.
When data is synchronized between partition replicas in Kafka, the follower uses a thread (ReplicaFetcherThread) to actively pull messages from the partition leader in batches, so the follower behaves like a consumer. This greatly improves throughput.
A website is running a learning activity and needs to count the number of user visits to the site per minute. Which of the following options is the most appropriate way to meet this requirement?
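One way to picture a per-minute count is a tumbling window of 60 seconds. The sketch below is plain Python rather than Spark or Flink code. The visit timestamps are made up and `count_per_minute` is a hypothetical helper.

```python
from collections import Counter

def count_per_minute(timestamps):
    """Count events per one-minute tumbling window.

    timestamps: iterable of epoch seconds (ints or floats).
    Returns a dict mapping window start (in seconds) to visit count.
    """
    counts = Counter()
    for ts in timestamps:
        window_start = int(ts // 60) * 60  # floor to the start of the minute
        counts[window_start] += 1
    return dict(counts)

visits = [0, 15, 59, 60, 61, 125]  # hypothetical visit times in seconds
per_minute = count_per_minute(visits)
print(per_minute)
```

Each event falls into exactly one window, which is what distinguishes a tumbling window from a sliding one.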
In YARN task scheduling, once the ApplicationMaster has obtained the resources it applied for, it communicates with the corresponding ResourceManager and asks it to start the task.
In Zookeeper's service model, the Leader node works in active/standby mode. All other nodes are Follower nodes.
Kafka is a distributed message publishing and subscription system. It only forwards messages and does not save messages.
Watermark is a mechanism proposed by Apache Flink to process EventTime window calculation, which is essentially a timestamp.
The main difference between YARN-client and YARN-cluster lies in the ApplicationMaster process.
Flink is a computing framework that combines batch processing and stream processing. Its core is a stream data processing engine for data classification and parallel computing.
Batch processing of high-value and highly aggregated information and knowledge is the main business requirement of the big data industry
MOB data in HBase is stored directly on HDFS in HFile format. The address and size information of each file is then stored as a value in the ordinary HBase store, and the files are centrally managed through tools. This can greatly reduce the frequency of HBase compaction and split operations and improve performance.
Flink state preservation mainly depends on the ( ) mechanism, which periodically backs up the state in the program.
As an authentication server center, Kerberos can provide unified authentication services to all services in the cluster and to customers' secondary development applications.
Hive is a data warehouse infrastructure built on Hadoop. It provides a set of tools for extract-transform-load (ETL) and a mechanism for storing, querying, and analyzing large-scale data stored in Hadoop.
In the YARN service, if you want to set the capacity of the queue QueueA to 30%, which parameter should be configured?
Given server.channels=ch1, to set the Channel type to File Channel, which of the following configurations is correct?
The smallest processing unit of HBase is Region. Where is the routing information between User Region and Region Server stored?
A FusionInsight HD cluster contains multiple services, and each service consists of several roles. Which of the following are the roles of the service?
Which of the following commands can be used to clear the data of all databases under the Redis instance?
In Streaming, event listening is mainly implemented through which of the following services provided by Zookeeper?
When installing a FusionInsight HD cluster in security mode, which components must be installed?
Which of the following statements about FusionInsight network security reliability are correct?
Which of the following belongs to the shuffle mechanism in the MapReduce process?
In Hadoop, if yarn.scheduler.capacity.root.QueueA.minimum-user-limit-percent is set to 50, which of the following statements is wrong?
According to their implementation principles, Flink windows can be divided into which of the following three types?
Spark SQL tables often contain many small files (much smaller than the HDFS block size). In this case, Spark starts more tasks to process these small files. When the SQL logic includes a shuffle operation, the number of hash buckets increases greatly, which seriously affects performance.
Hive can create managed tables and external tables. Which of the following descriptions of these two table types is correct?
Which of the following parameters can be configured to clean up the logs generated in Kafka?
In the FusionInsight HD platform, HBase does not currently support secondary indexes.
In Streaming, the exactly-once message reliability level is achieved through the ACK mechanism.
The HUAWEI CLOUD MapReduce Service provides tenants with a fully controllable, one-stop, enterprise-level big data cluster cloud service. It is fully compatible with open-source interfaces and combines HUAWEI CLOUD computing and storage advantages with big data industry experience to give customers a high-performance, low-cost, flexible, and easy-to-use full-stack big data platform. It can easily run big data components such as Hadoop, Spark, HBase, Kafka, and Storm for real-time and offline analysis and mining to discover new business opportunities.
In Kafka, ( ) is used to record the message read position. (fill in the blank)
Hive does not check whether the data conforms to the schema during load. Hive follows schema on read: the data fields are checked against the schema only when the data is read.
In Spark On YARN mode, the node without NodeManager cannot start the executor to execute the Task
In data stream processing, the system time (processing time) is often used as the time of an event. Which of the following descriptions of processing time is incorrect?
When data is imported into a Hive table, its validity is not checked. It is checked only when the data is read.
In the MRS platform, which component does the Flume data flow not need to pass through within a node?
Which statement is correct about worker (worker process), Executor (thread) and task (task)?
Regarding the multiple optimizations provided for Redis, which of the following options is wrong?
Which of the following designs are mainly considered in the planning process of the big data business consulting service plan?
What are the correct understandings and descriptions of the main features of big data?
Regarding the comparison between Hive and traditional data warehouse, which of the following descriptions is wrong?
Which of the following options does the main role of Zookeeper in distributed applications not include?
In the FusionInsight Manager interface, which of the following options is not included in the operations on Loader?
In order to ensure the reliability of snapshot storage of streaming applications, where are the snapshots mainly stored?
In the FusionInsight HD system, which component does the Flume data flow not need to pass through within a node?
Which of the following is not a characteristic of the MapReduce component in Hadoop?
SolrCloud mode is the cluster mode. In this mode, which of the following services does the Solr server strongly depend on?
The HDFS data reading process includes the following steps, please choose the correct order. (Drag title) Order: ECADB
Which of the following statements comparing Spark Streaming and Streaming is incorrect?
For which components in FusionInsight HD can Zookeeper provide distributed management support?
In a FusionInsight HD cluster with 300 nodes, if the recommended deployment scheme is adopted, which partitions must not exist on the control node?
As shown in the figure, which languages are supported by the DataStream API, Flink's stream data processing interface?
Which description of the role of the standby NameNode in HDFS is correct?
In the Hadoop platform, to view the information of an application in the YARN service, what command is usually needed?
In which of the following stages is Flink's data transformation operation completed?
In the FusionInsight HD system, which of the following methods cannot be used to view the execution result of a Loader job?
During Flume data collection, which of the following can filter and modify the data?
Hardware failure is considered the norm. To address this, HDFS uses a replication mechanism. By default, how many copies of a file does HDFS save? ( )
When planning and deploying a FusionInsight cluster, it is recommended to deploy ( ) management nodes, at least ( ) control nodes, and at least ( ) data nodes.
Regarding the Kafka alarm about insufficient disk capacity, which of the following analyses of the possible causes is incorrect?
In scenarios with many small files, Spark starts many tasks. When the SQL logic includes a shuffle operation, the number of hash buckets increases greatly, which seriously affects performance. In FusionInsight, small-file scenarios usually use the ( ) operator to merge the partitions generated by small files in the table, reducing the number of partitions, avoiding too many hash buckets during shuffle, and improving performance.
What component does HBase use by default as its underlying file storage system?
Which of the following is not a role or service involved in the process of reading data in HBase?
The figure below shows the configuration of HDFS tiered storage. If the number of copies of a block is 4, which of the following statements is wrong?
Which of the following descriptions of the order in which the YARN scheduler allocates resources is correct?
Regarding the Controller and NodeAgent in FusionInsight Manager, which statement is correct?
A. Controller sends heartbeat to NodeAgent every 3 seconds
B. NodeAgent accepts commands from Controller and executes specific actions
C. Controller must be deployed on each node
D. NodeAgent is open source enhanced
Assuming the amount of data is about 200 GB and the maximum capacity of each shard is limited to 30 GB, what is the appropriate maximum number of shards?
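The arithmetic behind a question like this can be sketched as a ceiling division. This is only one reading of the question: it assumes the data is spread evenly and that the goal is the smallest shard count that keeps every shard within the limit. `shards_needed` is a hypothetical helper name.

```python
import math

def shards_needed(total_gb, max_shard_gb):
    """Smallest shard count such that no shard exceeds max_shard_gb."""
    return math.ceil(total_gb / max_shard_gb)

n = shards_needed(200, 30)  # 200 / 30 is about 6.67, rounded up
print(n)
```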
When installing the Streaming component of FusionInsight HD, on how many nodes must the Nimbus role be installed?
In the FusionInsight product, which of the following descriptions of Kafka topics is incorrect?
In Hadoop, if yarn.scheduler.capacity.root.QueueA.minimum-user-limit-percent is set to 50, which of the following statements is wrong?
In order to improve the fault tolerance of Kafka, Kafka supports the replication strategy of partition. Which of the following descriptions about Leader partition and Follower partition is wrong?
When deploying FusionInsight HD, how many FlumeServer nodes are recommended to be deployed in the same cluster?
When creating a Loader job, in which of the following steps can the number of Maps be set?
Which of the following descriptions about FusionInsight CTBase is incorrect?
A. CTBase's data read/write interface uniformly encapsulates the defined interfaces and automatically merges and parses cold fields, so the application does not need to merge and parse them.
B. CTBase is a cluster table development framework based on HBase
C. CTBase provides a WebUI for metadata definition and a dedicated table design tool to reduce the difficulty of table design
D. CTBase's Java API provides a set of interfaces for HBase connection pool management. Connections are shared internally to reduce the difficulty of client application development.
Which of the following descriptions about the basic operations of Hive SQL is correct?
Regarding Hive log collection in the FusionInsight HD Manager interface, which option is incorrect?
Which of the following descriptions about the key features of Flink is incorrect?
Which of the following functions can the Kafka Cluster Mirroring tool achieve?
FusionInsight HD uses the HBase client to write 10 pieces of data in batches. A RegionServer node hosts two Regions of the table, A and B. Of the 10 pieces of data, 2 belong to A and 4 belong to B. How many RPC requests need to be sent to this RegionServer to write these 10 pieces of data?
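A rough model of this scenario, under the assumption that the client groups a batch of puts by destination RegionServer and sends one batched request per server. The region and server names are invented. The question does not say where the remaining four rows live, so they are placed on a second hypothetical server.

```python
from collections import defaultdict

# Hypothetical region -> RegionServer placement.
region_to_server = {"A": "rs1", "B": "rs1", "C": "rs2"}

# Region of each of the 10 rows: 2 in A, 4 in B, the other 4 assumed in C.
row_regions = ["A"] * 2 + ["B"] * 4 + ["C"] * 4

def rpc_count_per_server(regions, placement):
    """Group rows by target server and model one batched RPC per server."""
    rows_by_server = defaultdict(int)
    for region in regions:
        rows_by_server[placement[region]] += 1
    rpcs_by_server = {server: 1 for server in rows_by_server}
    return rpcs_by_server, dict(rows_by_server)

rpcs, rows = rpc_count_per_server(row_regions, region_to_server)
print(rpcs, rows)
```

In this model the six rows for Regions A and B travel in a single request, because both Regions sit on the same server.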
Which of the following descriptions of the HBase secondary index is correct?
In YARN label-based scheduling, which of the following is labeled?
When YARN performs resource scheduling, MapTask and ReduceTask run in ( ).
LdapServer in Huawei's big data platform can support different types of operations such as query, update, and authentication.
According to their implementation principles, Flink time windows can be divided into which of the following three types?
The secondary index gives HBase the ability to index by the values of certain columns. A query first searches the index table and then locates the position in the data table, without a full table scan.
The combine of MapReduce in the Map phase is a pre-grouping process and is optional.
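The idea of a combiner can be sketched in a few lines of plain Python. This is not Hadoop API code: `map_phase` and `combine` are hypothetical stand-ins, and word count is used only as a familiar example of local pre-aggregation before the shuffle.

```python
from collections import Counter

def map_phase(line):
    """Emit a (word, 1) pair for every word in one input line."""
    return [(word, 1) for word in line.split()]

def combine(pairs):
    """Pre-aggregate map output locally by key before the shuffle."""
    totals = Counter()
    for key, value in pairs:
        totals[key] += value
    return sorted(totals.items())

mapped = map_phase("a b a c a")
combined = combine(mapped)
print(len(mapped), combined)
```

Five map output records shrink to three after combining, which is why the step reduces shuffle traffic while leaving the final result unchanged for sum-like operations.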
Kerberos can only provide security authentication for services within the cluster.
The size of the memory allocated by YARN to a Container in the Hadoop system can be set by the parameter yarn.app.mapreduce.am.resource.mb.
In Huawei FusionInsight, HBase's table design tool, connection pool management, and enhanced SDK can simplify the business development of complex data tables.
An LdapServer group provides unified group management of users. If a user is added to a group, a corresponding record is added in that group.
Kafka, as a distributed message system, supports online and offline message processing and provides Java APIs for other components to use. In the FusionInsight solution, Kafka belongs to the FusionInsight HD module.
The HDFS data reading process includes the following steps, please choose the correct order. (Drag picture title, sort question)
Which of the following descriptions of component capabilities in the Hive architecture is correct?
In HDFS, the size of a block (Block) is much larger than the smallest storage unit, which minimizes addressing overhead.
User rights management uses role-based access control (RBAC) and provides visualized, unified multi-group user rights management in clusters.
The data compression feature in Flume is mainly used for which of the following purposes?
The checkpoint mechanism in Flink continuously draws snapshots of streaming applications. The state snapshots of streaming applications can only be saved in the HDFS file system.
Flink provides the Checkpoint mechanism based on lightweight distributed snapshot technology. Distributed snapshots can process the state data of Tasks/Operators as a globally unified snapshot.
